COMPOSITIONS FOR INCREASED BIOAVAILABILITY OF
ORALLY DELIVERED THERAPEUTIC AGENTS

FIELD OF THE INVENTION

The present invention relates to the enhancement
of the bioavailability of orally delivered therapeutic
agents. In particular, the invention involves
improving the bioavailability of therapeutic agents by
combining them with a suitable transport promoter
which is capable of facilitating the penetration of
the therapeutic agent across epithelial and
endothelial cell barriers. The transport promoter of
the present invention is preferably an invasion
proficient bacterial coat protein which, when combined
with a therapeutic agent, can effectuate the
penetration of the therapeutic agent through the
gastrointestinal lining.

BACKGROUND OF THE INVENTION

The common routes of therapeutic agent
administration are enteral (oral) and parenteral
(intravenous, subcutaneous, and intramuscular) routes
of administration. The intravenous route is
advantageous for emergency use when a very rapid and
predictable increase in blood level of the therapeutic
agent is necessary. In addition, the intravenous
route allows for easy dosage adjustments and is useful
for administering large volumes of a drug.
Intravenous drug administration, however, has several
limitations. One problem is the risk of adverse
effects resulting from the rapid accumulation of a
high concentration of the therapeutic agent in plasma
and/or tissues. Also, repeated injections by the
intravenous route may cause discomfort to the patient.
In addition, the delivery is inconvenient because it
often must be administered by a health care provider.

The oral administration of a therapeutic agent is
generally more convenient, economical and acceptable.
Oral delivery is by far the most popular delivery
method where the drug is intended to be absorbed by
the gastrointestinal tract. There are, however,
several problems associated with the oral delivery of
therapeutic agents. For example, oral administration
is limited when the therapeutic agent is not
efficiently absorbed by the gastrointestinal tract.
Unlike the administration of a therapeutic agent by
injection, which circumvents the highly protective
barriers of the human body, the absorption of a
therapeutic agent by the gastrointestinal tract may be
inefficient for poorly soluble, slowly absorbed, or
unstable therapeutic preparations. As a result, many
important therapeutic agents, which are not
effectively absorbed when administered orally, are
currently delivered by injection.

In particular, the delivery of polypeptide and
protein therapeutic agents via the gastrointestinal
tract is especially difficult because of the inherent
instability of such materials and the poor
permeability of the intestinal mucosa to high
molecular weight substances. The gastrointestinal
tract is an organ of the body that is specifically
developed to physically, chemically and enzymatically
break down ingested nutrients. It is also
responsible for the uptake of nutrients into the body
and for the elimination of waste. The gastrointestinal
tract includes the stomach and intestine.
The stomach is specifically designed for the digestion
of nutrients, the stimulation of other regions of the
gut to secrete, the storage of food, and the release
of chyme into the intestine at a controlled rate.
Nutrient uptake is not an important function of the
stomach. The small intestine includes the duodenum,
jejunum and ileum. Distal to the stomach is the
duodenum, where neutralization of the acidic chyme
occurs. Surfactants for lipid digestion and proteases
for protein breakdown are also secreted into the
duodenum. There is little absorption in this section
of the gut. Uptake of the nutrient breakdown products
mainly occurs in the lower small intestine: the
jejunum and the ileum are 2.8 meters and 4.2 meters in
length respectively, and have a combined surface area
of 460 m².

The large intestine, which is composed of the
cecum and the colon, is responsible for the storage of
waste, and also for water and salt balance. There is
little enzyme activity in this section of the gut, and
it is the least permeable section of the
gastrointestinal tract.

The majority of the surface of the small and
large intestine is lined by a layer of epithelial
cells called the enterocytes, which are specialized
villus absorptive cells. The lining of the gut is
also composed of a mucus lining which acts as an
unstirred water layer (1). The mucus is a barrier to
macromolecules with a molecular weight greater than 17
kDa (2). The enterocyte lining forms a tight lipid
barrier to peptides having a molecular weight as low
as 500 Da (3). Therefore, the lining of the gut is
composed of an efficient barrier to both lipophilic
and hydrophilic molecules due to the mucus and the
enterocyte linings, respectively. The oral
administration of a large, macromolecular therapeutic
agent is, therefore, very limited by the barrier
effect of the gastrointestinal lining. This is
certainly true of the recombinant therapeutic
proteins.

The gastrointestinal tract, however, cannot be a
complete barrier to all macromolecules because many
macromolecules are required for nutrient intake.
These include, among others, amino acids, glucose and
vitamins. For such molecules, specific transport
mechanisms exist. Amino acids and glucose are taken
up by transporters situated in the lumenal or apical
membrane domains of the enterocytes. Receptors for
vitamin uptake are also present in the apical domain
of the enterocyte lining.

In addition, certain microorganisms, including
both viruses (<100 nm in diameter) and bacteria (1 µm
in diameter), are able to invade the body from the gut
by crossing the epithelial barrier. Certain cells of
the immune system, including neutrophils and
macrophages, are also able to permeate both epithelial
and endothelial barriers.

Bacteria that invade the enterocyte barrier
include Yersinia, Salmonella, Shigella and Listeria.
In the case of Yersinia, the method of attachment to
the cell surface and invasion into the cell has been
characterized. In Yersinia pseudotuberculosis and in
Yersinia enterocolitica, a protein termed invasin
(INV) is expressed on the surface of the bacteria. It
has been shown that the INV protein is able to bind to
the β1 integrin family of receptors (4, 5). The
integrin receptor family belongs to a group of
molecules termed the adhesion receptors and is
involved in promoting cell attachment to the
extracellular matrix (6). Following binding of the
INV protein to the cell, internalization of the
protein occurs (7). This event has been demonstrated
in HEp-2 cells, which are epithelial-like cells from
the larynx, and in some other epithelial cells. The
invasion event has not been demonstrated in the
enterocyte cells.

Another invasion-mediating protein identified in
Yersinia enterocolitica has been termed the AIL
protein (for attachment-invasion-locus) (8). The
receptor utilized by this protein is as yet unknown,
and as with INV, the binding and invasion event has
not been demonstrated for gut epithelium.

In vivo studies have shown that Yersinia can
invade the body from the gut through the Peyer's
patches (9, 10). No studies have shown that the INV
and AIL proteins are able to mediate binding and
invasion of the enterocytes lining the gut.

The delivery of a therapeutic agent through the
enterocyte lining would be preferable, as compared to
Peyer's patch uptake, because Peyer's patches are
known to be variable from species to species and
between individuals of the same species. In addition,
materials delivered through the Peyer's patches are
more effectively delivered as an antigen.

CURRENT METHODS OF DRUG DELIVERY 

The efficacy of an orally administered
therapeutic agent depends on the agent being absorbed
from the gastrointestinal tract into the circulation.
The permeability barrier of the gut epithelium is
perhaps the most limiting factor to the reproducible
oral absorption of therapeutic agents.

One previous attempt to circumvent non-parenteral
bioavailability problems involved intranasal
administration of a therapeutic agent. Investigators
have also attempted to pass therapeutic agents across
the skin through the use of chelating agents, bile
salts and surfactants. Similar materials have been
used to increase the absorption of therapeutic agents
from the gastrointestinal tract (11). Other
investigators have attempted to increase
bioavailability from the gastrointestinal tract
through the use of liposome-entrapped therapeutic
agents.

Liposomes have also been used as a means for
target-specific delivery of an encapsulated,
biologically active material. Liposomes have been
attached to materials such as viral membrane proteins,
antibodies, streptavidin, transferrin and other
ligands as a means of directing the therapeutic agent
to the target cell (12). The results of such delivery
methods, however, have not demonstrated that the
liposome is an effective means for promoting the
bioavailability of orally administered proteins. In
fact, liposomes alone or attached to such
site-specific ligands are unlikely to facilitate
absorption of orally delivered agents because
liposomes typically are degraded in the lumen of the
gut.

Invasive microorganisms have been used to
transfer materials into host cells. Isberg et al.
(13) describe the genetic transfer of INV or AIL genes
into a microorganism to impart an invasive phenotype
to that microorganism. The modified microorganism is
then used as a vaccine to introduce a pathogen of
interest into a host cell. While this technique
describes the introduction of exogenous INV and AIL
genes to impart an invasive capability on a
microorganism, there is no provision for increasing
the bioavailability of a therapeutic agent or
improving the transport of a therapeutic agent through
a mucosal barrier.

Another delivery technique involves nanosphere
and microsphere technology (14, 15). This technology
is based upon the observed uptake of such microspheres
into the body through the M cells of the Peyer's
patches in the gastrointestinal tract. There is,
however, no moiety involved that would enhance the
uptake of such particles. The delivery of a
therapeutic agent through the Peyer's patches is not
an efficient way to orally deliver non-vaccine based
therapeutics. A material delivered by this route may
be presented to the body as an antigen, and this is
not a desired attribute for a non-vaccine therapeutic
agent.

Another previously available delivery technique
involves the use of proteinoid technology (17).
Orally administered delivery systems for insulin,
heparin and physostigmine include the use of
encapsulating spheres which are predominantly less
than 10 microns (µm) in diameter and made of
artificial polypeptides. The proteinoids are intended
to pass through the gastrointestinal mucosa and
thereby deliver a therapeutic agent. One very
apparent problem with this system is that the
proteinoids release the drug component under neutral
conditions. Because such conditions are found in the
gut, especially in the lower small intestine (i.e.,
ileum), it would be expected that the proteinoids
mainly would release the therapeutic agent into the
lumen of the gut rather than transport the therapeutic
agent across the gastrointestinal lining.

Another drug delivery technique involves
receptor-mediated transcytosis, wherein the amino acid
sequences of various growth factors are incorporated
into the system (i.e., epidermal growth factor and
transforming growth factor alpha) (48). Chimeric
molecules or fusion peptides are formed by conjugating
the growth factor to a desired protein. The proposed
chimeric molecules are transcytosed across epithelial
cells via an interaction with growth factor receptors.
The chimeric molecule system, however, fails to
provide for the protection of the therapeutic against
the gut environment. Moreover, this delivery
technique would be dependent on a receptor system
which is normally present at low levels on the apical
or lumenal domain of the enterocyte. The binding and
uptake of growth factors from the lumen of the gut is
a non-physiological event.

Notwithstanding the above-noted developments in 
the arts of cell targeting and drug delivery, it is 
clear that there is a need for novel compositions 
which enhance the bioavailability of an orally 
delivered therapeutic agent. It is not sufficient to 
merely bind the drug to a target cell. 



SUMMARY OF THE INVENTION



A major problem associated with the oral delivery
of a therapeutic agent is the hostile environment of
the gut, especially to protein and peptide
therapeutics. Another problem is the impermeability
of the mucosal barrier in the gut, especially to large
molecular weight materials.

It is an object of the present invention to
increase the bioavailability of orally delivered
therapeutic agents, particularly polypeptides and
proteins, by providing for the improved transport of
such therapeutics across the body's epithelial
barriers. It is a further object of the present
invention to provide a delivery system wherein the
delivery means or transport enhancer is not readily
subject to degradation in the gut or prone to the
early release of the biologically active material.

It is another object of the present invention to
provide a transport enhancer which is not subject to
the low residency time of the proteinoids at the
mucosal surface.

The present invention is based on the finding 
that compositions containing INV or AIL invasive 
proteins are able to cross the cells of the 
gastrointestinal tract through an internalization and 
transcytosis event. This was a novel observation and 
formed the basis of the current invention concerning 
the delivery of therapeutic agents. 

The present invention provides a delivery system,
involving a therapeutic agent and an invasion
proficient bacterial protein which transports the
therapeutic agent across the gastrointestinal membrane
barrier, thereby increasing the oral bioavailability
of that agent. The system may optionally include a
carrier component such as a liposome or polymer-based
particle. In an alternate embodiment, the
pharmaceutical composition may involve a fusion
protein including the therapeutic moiety and an
invasion proficient bacterial protein to effect
delivery of the composition across the
gastrointestinal tract. In yet another embodiment,
the therapeutic moiety and invasion proficient protein
may be linked by a degradable peptide sequence.

The delivery system of the present invention
provides a composition that is stable in the gut,
enhances the uptake of the therapeutic moiety and is
expected to cross both the enterocytes and the M cells
of the Peyer's patches. The system provides an
increase in bioavailability as well as a clear
advantage over existing particle-based systems that
are dependent on non-specific uptake through the
antigen-presenting M cells. By increasing the
bioavailability of intact and active polypeptide and
protein therapeutic agents, the present invention also
obviates the need for the parenteral administration of
such therapeutic agents which are otherwise degraded
in the gut or relatively unable to cross the
gastrointestinal barrier.



DESCRIPTION OF THE DRAWINGS

Figure 1 illustrates the oligonucleotide and
amino acid sequences of invasin (INV) protein (SEQ ID
NO:1).

Figure 2 illustrates the oligonucleotide and
amino acid sequences of attachment-invasion-locus
(AIL) protein (SEQ ID NO:2).

Figure 3 illustrates the oligonucleotide and
amino acid sequences of maltose binding protein (MBP)
(SEQ ID NO:3).

Figure 4 illustrates the effect of invasin
transfection and expression on the binding of E. coli
to the human enterocyte Caco-2 cell line.

Figure 5 illustrates the effect of invasin
transfection and expression on the internalization of
E. coli into the human enterocyte Caco-2 cell line.

Figure 6 illustrates the effect of AIL
transfection and expression on the binding of E. coli
to the human enterocyte Caco-2 cell line.

Figure 7 illustrates the effect of AIL
transfection and expression on the internalization of
E. coli into the human enterocyte Caco-2 cell line.

Figure 8 summarizes a nine hour study showing the
effect of both INV- and AIL-transfection and
expression on the internalization of E. coli into the
non-polarized human enterocyte cell line.

Figure 9 illustrates the polarity of receptor
distribution in Caco-2 monolayers grown on
Transwell-COL inserts. The distribution of the
fibronectin, epidermal growth factor (EGF),
taurocholic acid (TA) and intrinsic factor-vitamin B12
complex (IF-VB12) receptors is shown.

Figure 10 illustrates the surface binding of INV-
and AIL-transfected E. coli to polarized Caco-2 cell
monolayers.

Figure 11 illustrates the internalization of INV-
and AIL-transfected E. coli into polarized Caco-2 cell
monolayers.

Figure 12 illustrates the time course of
transcytosis of INV- and AIL-transfected E. coli
across the polarized Caco-2 cell monolayers.

Figure 13 illustrates specificity of the binding
of radiolabelled MBP-INV to the non-polarized Caco-2
cell line.

Figure 14 illustrates the amino acid sequence of
a fusion protein of invasin and maltose binding
protein (SEQ ID NO:4) using the 192 amino acids from
the C-terminal end of INV from Y. pseudotuberculosis.

Figure 15 illustrates the amino acid sequence of
a fusion protein of attachment-invasion-locus protein
and maltose binding protein (SEQ ID NO:5).

Figure 16 illustrates the liposome uptake by
Caco-2 cells with and without conjugation to MBP-INV.

(Note: All the points shown in the drawings
represent the mean ± SEM, where n=3.)



DETAILED DESCRIPTION OF THE INVENTION

It is known that many bacteria, viruses and cells
of the immune system are able to permeate the
epithelial and endothelial barriers of the body
through the expression of integral or peripheral
membrane proteins. Current investigations of
bacterial proteins have revealed at least two proteins
that appear to be involved in the invasion of bacteria
into the human host. These invasive proteins have
been termed invasin (INV) and attachment-invasion-locus
(AIL) proteins. Both proteins have been cloned
from Yersinia enterocolitica, although INV is also
known to exist with large homology in
Y. pseudotuberculosis.

The present invention involves the discovery that
the INV and AIL proteins may be used to mediate the
transport of therapeutic compositions, including large
particles (approximately 1 µm), across the polarized
human enterocyte, thereby enhancing the penetration or
passage of a therapeutic composition across the
gastrointestinal barrier. Moreover, it has been
determined that such invasion proteins can be removed
from their natural bacterial expression system yet
retain the ability to bind the human enterocyte.

These findings led to the development of the
present oral delivery system based upon the
combination of a therapeutic agent with the INV or AIL
protein or derivatives thereof. The bacterial
invasion proteins bind to receptors expressed on
the apical or luminal domains of the enterocytes or M
cells of the Peyer's patches. In this way, INV and AIL
act as bioadhesive agents and thereby increase the
residence time of the pharmaceutical composition in
the gut. This in itself can increase the
bioavailability of the therapeutic agent by promoting
uptake of the therapeutic agent. It was further
determined, however, that INV and AIL also mediate the
movement of the composition either paracellularly or
transcellularly across the gastrointestinal tract, and
thereby facilitate the transport of the therapeutic
agent across the mucosal barrier. The bacterial
invasion proteins may also be used for increasing drug
transport through other non-invasive routes where the
appropriate receptors are expressed. Such routes may
include nasal, ocular, rectal, vaginal, pulmonary and
transdermal routes of administration.

In one embodiment of the present invention, the
bacterial invasion protein is indirectly associated
with the therapeutic agent through a linking means
such as a polymer chain, or directly associated with
the therapeutic agent by a chemical means. An
alternative embodiment of the present invention is
based upon the incorporation of a therapeutic agent
into or onto a carrier that is associated with the
bacterial invasion protein, such as INV and AIL or
fragments or derivatives thereof. The bacterial
invasion protein might be bound to, encapsulated
within, incorporated in the structure of, or merely
combined with the carrier component. Microparticles
and liposomes are exemplary of the carrier component
in such a delivery system.

The terms "therapeutic agent", "pharmaceutical",
"biologically active material" and "drug" may be used
interchangeably, and as used herein, preferably
include proteins, hormones and/or medicinal peptides
useful for treating a medical or veterinary disorder,
preventing a medical or veterinary disorder, or
regulating the physiology of a human being or animal.
Suitable therapeutic agents include cytokines, as well
as a wide range of cytotoxic drugs, muscle relaxants,
antihypertensives, analgesics, steroids, vitamins,
sedatives and hypnotics, antibiotics, chemotherapeutic
agents, prostaglandins and radiopharmaceuticals.

The terms "transport enhancer", "transporting
ligand" and "ligand" may be used interchangeably, and
as used herein, preferably include bacterial protein
molecules which, when conjugated to a therapeutic
agent, are capable of increasing the delivery of the
therapeutic agent across a mucosal membrane such as
the gastrointestinal barrier. In preferred
embodiments, "transport enhancer" is intended to
include invasion proficient bacterial coat proteins,
or fragments or analogs thereof. Such bacterial
invasion proteins may be isolated from bacterial
cultures or can be produced by known recombinant or
synthetic techniques. Methods of isolating and
purifying MBP-INV fusion proteins have previously been
described (17, 18), but they have not previously been
used in the compositions and methods of the
present invention.

In its basic form, the drug delivery system of
the present invention is composed of a transport
enhancer and the desired therapeutic agent. In an
alternate form, the drug delivery system includes an
additional component: a carrier moiety. Thus, the
pharmaceutical compositions of the present invention
may include a transport enhancer such as a bacterial
invasion protein. The transport enhancer is
associated with or attached to a carrier component,
which in preferred embodiments includes latex
microspheres or liposomes such as those composed of
dipalmitoylphosphatidyl-ethanolamine
(DPPC):cholesterol (chol):N-glutaryl-dioleoyl-
phosphatidylethanolamine (NG-DOPE). The therapeutic
agent can be incorporated into or onto the carrier by
various methods known in the art or it may be attached
to or associated with the transport enhancer.

Exemplary transport enhancers include invasion
proficient bacterial proteins such as INV and AIL.
Exemplary amino acid and nucleotide sequences of the
INV and AIL proteins are illustrated in Figures 1 and
2, respectively, as well as in SEQ ID NOs:1 and 2.
INV, an 835 amino acid single chain polypeptide, has
been well characterized in the art (20). AIL, a 162
amino acid single chain polypeptide, has also been
well characterized in the art (21).

The receptor binding region of INV involves the
192 amino acids at the C-terminal end of the protein
(17). This region has been shown to retain the
binding affinity of the bacterial invasion protein,
and therefore, any sequence containing this region
would be suitable for use in the present invention.

The receptor binding regions of AIL which are
necessary or sufficient for binding to the bacterial
protein receptor would include all or some of the
regions from the four extracellular loops (22). These
regions include the following sequences:

Loop 1    QSHVKENGYTLDNDPK
Loop 2    HQGYDFFYGSNKFGHGDVD
Loop 3    HGKVKASVFDESISASKT
Loop 4    KLDSIKVG



Invasion proficient bacterial proteins suitable
for use in the present invention may be derived from a
variety of DNA sequences encoding such proteins. The
selected DNA sequence may be a nucleic acid molecule
encoding the invasive protein (e.g., an INV or AIL
protein including sequences as set forth in Figures 1
and 2) or their complementary strands, naturally
occurring allelic variants, sequences capable of
hybridizing to a protein-coding area of such DNA
sequences under stringent conditions, and sequences
which, but for the degeneracy of the genetic code,
would hybridize with the protein-coding area of these
defined DNA sequences.

Suitable invasion proficient bacterial proteins
also include derivatives of the amino acid sequences.
Such derivatives could consist of a truncated form of
the invasive protein, especially with deletion of the
sequence from the amino terminal end of the INV
protein as described above. Such small molecule
derivatives of the bacterial proteins are advantageous
in that they are less likely to be immunogenic.



Further modifications in the peptides or DNA
sequences encoding the invasion proficient bacterial
proteins can be made by one skilled in the art using
known techniques. Modifications of interest in the
protein sequences may include the replacement,
insertion or deletion of a selected amino acid
residue. Naturally occurring amino acids may be
divided into groups based upon common side chain
properties:

Hydrophobic:               norleucine, Met, Ala, Val, Leu, Ile
Neutral hydrophilic:       Cys, Ser, Thr
Acidic:                    Asp, Glu
Basic:                     Asn, Gln, His, Lys, Arg
Residues that influence
chain orientation:         Gly, Pro
Aromatic:                  Trp, Tyr, Phe

Nonconservative substitutions will entail exchanging a 
member of one of these classes for another. Other 
exemplary substitutions are illustrated in Table 1. 
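
The class-based rule just stated can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the disclosed method: the dictionary transcribes the six side-chain groups given in this description, "Nle" is an assumed abbreviation for norleucine, and the names `SIDE_CHAIN_GROUPS`, `group_of` and `is_conservative` are hypothetical. A substitution is treated here as conservative when both residues fall in the same group and as nonconservative otherwise.

```python
# Side-chain groups as listed in the description (three-letter codes).
# "Nle" is an assumed abbreviation for norleucine.
SIDE_CHAIN_GROUPS = {
    "hydrophobic": {"Nle", "Met", "Ala", "Val", "Leu", "Ile"},
    "neutral hydrophilic": {"Cys", "Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Asn", "Gln", "His", "Lys", "Arg"},
    "chain orientation": {"Gly", "Pro"},
    "aromatic": {"Trp", "Tyr", "Phe"},
}


def group_of(residue):
    """Return the name of the side-chain group containing the residue."""
    for name, members in SIDE_CHAIN_GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def is_conservative(original, replacement):
    """True when both residues belong to the same side-chain group."""
    return group_of(original) == group_of(replacement)
```

For instance, `is_conservative("Asp", "Glu")` is true because both residues are acidic, whereas exchanging Gly for Trp crosses from the chain-orientation group to the aromatic group and is therefore nonconservative.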
Table 1

Original      Exemplary                              Preferred
Residue       Substitution                           Substitution

Ala (A)       Ile, Leu, Val                          Val
Arg (R)       Asn, Gln, Lys                          Lys
Asn (N)       Arg, Gln, His, Lys                     Gln
Asp (D)       Glu                                    Glu
Cys (C)       Ser                                    Ser
Gln (Q)       Asn                                    Asn
Glu (E)       Asp                                    Asp
Gly (G)       Pro                                    Pro
His (H)       Arg, Asn, Gln, Lys                     Arg
Ile (I)       Ala, Leu, Met, Phe, Val, norleucine    Leu
Leu (L)       Ala, Ile, Met, Phe, Val, norleucine    Ile
Lys (K)       Arg, Asn, Gln                          Arg
Met (M)       Ile, Leu, Phe                          Leu
Phe (F)       Ala, Ile, Leu, Val                     Leu
Pro (P)       Gly                                    Gly
Ser (S)       Thr                                    Thr
Thr (T)       Ser                                    Ser
Trp (W)       Tyr                                    Tyr
Tyr (Y)       Phe, Ser, Thr, Trp                     Phe
Val (V)       Ala, Ile, Leu, Met, Phe, norleucine    Leu
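
As a hedged illustration of how the preferred substitutions of Table 1 might be applied, the following Python sketch replaces a single residue of a peptide with its preferred substitute. The mapping uses standard one-letter amino acid codes and transcribes only the "Preferred Substitution" column; the function name `substitute_preferred` and the choice of the Loop 4 sequence of AIL as the example peptide are assumptions made purely for illustration and do not describe an embodiment of the invention.

```python
# Preferred substitution for each residue, transcribed from Table 1
# using standard one-letter amino acid codes.
PREFERRED_SUBSTITUTION = {
    "A": "V", "R": "K", "N": "Q", "D": "E", "C": "S",
    "Q": "N", "E": "D", "G": "P", "H": "R", "I": "L",
    "L": "I", "K": "R", "M": "L", "F": "L", "P": "G",
    "S": "T", "T": "S", "W": "Y", "Y": "F", "V": "L",
}


def substitute_preferred(sequence, position):
    """Replace the residue at a 1-based position with its preferred substitute."""
    if not 1 <= position <= len(sequence):
        raise IndexError("position is outside the sequence")
    index = position - 1
    replacement = PREFERRED_SUBSTITUTION[sequence[index]]
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical example: Asp at position 3 of the Loop 4 sequence becomes Glu.
variant = substitute_preferred("KLDSIKVG", 3)
```

Whether any such variant retains receptor binding would have to be established experimentally; the sketch only shows the bookkeeping of a single-residue replacement.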



Mutagenic techniques for making such replacements,
insertions or deletions are well known to those
skilled in the art (23). Conservative changes of 1 to
20 amino acids are preferred. Preferred peptides may
be generated by proteolytic or glycolytic enzymes, or
by direct chemical synthesis.

The selected bacterial adhesion protein may also
be modified to facilitate production and handling of
the composition. For example, the appropriate
invasion protein or amino acid sequence may be
produced to include an additional peptide or protein
component, such as the maltose binding protein (MBP),
which can enhance the purification of the protein from
the recombinant expression system. Figure 3 (SEQ ID
NO:3) depicts the amino acid and nucleotide sequences
of the maltose binding protein. Additions or
substitutions to the INV and AIL amino acid sequences
may also be used to facilitate the attachment or
immobilization of the transport enhancer to or on the
pharmaceutical agent or carrier component of the
pharmaceutical composition, thereby promoting the
retention of the transport enhancer. This could
include, for example, the addition of a cysteine
residue to the N-terminal end of the sequence to
facilitate chemical conjugation by disulfide bridging,
using for instance maleimide. Other deletions,
substitutions or additions to the amino acid sequence
may have the effect of stabilizing the transport
enhancer in solution or in the gut or in the serum.

Suitable transport enhancers are selected from
proteins or polypeptides which demonstrate an
appropriate binding affinity for the receptors found
in the cells that form the membrane barrier through
which the pharmaceutical composition is to be
transported. The amino acid sequences of the INV or
AIL proteins demonstrate such a binding affinity for
the receptors found in the gut. Preferably, the
transport enhancer will also have some specificity for
the cell type that is being targeted. The amino acid
sequences of the INV or AIL proteins demonstrate such
a specificity for human enterocytes, which is
advantageous for gastrointestinal delivery.

The novel compositions of the present invention
can be combined with conventional pharmaceutically
acceptable excipients suitable for the formulation of
therapeutic compositions. As used herein, the term
"pharmaceutically acceptable excipient" means a
non-toxic, inert solid, semi-solid or liquid component
included within the pharmaceutical formulation. Such
pharmaceutically acceptable carriers include, but are
not limited to, fillers, diluents, encapsulating
materials, solvents or formulation agents, involved in
facilitating the carrying or delivery of the
pharmaceutical agent. Some examples of the materials
that can serve as pharmaceutically acceptable
excipients include: sugars, such as lactose, glucose
and sucrose; starches such as corn starch and potato
starch; cellulose and its derivatives such as sodium
carboxymethyl cellulose, ethyl cellulose and cellulose
acetate; powdered tragacanth; malt; gelatin; talc;
excipients such as cocoa butter and suppository waxes;
oils such as peanut oil, cottonseed oil, safflower
oil, sesame oil, olive oil, corn oil and soybean oil;
glycols, such as propylene glycol; polyols such as
glycerin, sorbitol, mannitol and polyethylene glycol;
esters such as ethyl oleate and ethyl laurate; agar;
buffering agents such as magnesium hydroxide and
aluminum hydroxide; alginic acid; pyrogen-free water;
isotonic saline; Ringer's solution; ethyl alcohol and
phosphate buffer solutions, as well as other non-toxic
compatible substances used in pharmaceutical
formulations. Wetting agents, emulsifiers and
lubricants such as sodium lauryl sulfate and magnesium
stearate, as well as coloring agents, releasing
agents, coating agents, sweetening, flavoring agents,
preservatives, stabilizers, extenders, antioxidants,
surfactants, solubilizers, lubricants, suspending
agents, binders, disintegrating agents, coating
materials, etc., can also be present in the
composition, according to the judgement of the
formulator.

The excipient(s) must be "acceptable" in that the
materials are compatible with the other components of
the formulation and are not deleterious to the
recipient thereof; this includes materials suitable
for use in contact with the tissues of human beings
and animals without excessive toxicity, irritation,
allergic response, or other problems or complications,
commensurate with a reasonable benefit/risk ratio.

The compositions of the present invention which
include excipients can be formulated according to
known methods for the preparation of pharmaceutically
useful compositions. Suitable methods are described,
for example, in Remington's Pharmaceutical Sciences
(19). The proportional ratio of therapeutic agent to
excipient will naturally depend on the chemical
nature, solubility, and stability of the active
ingredient, as well as the dosage contemplated.






The carrier component of the pharmaceutical
compositions of the present invention may include
polymeric microparticles or nanoparticles of different
materials and of very different sizes. Such particles
may have a membrane-walled form, in which the core
material is concentrated as a reservoir, or a matrix
form in which core material is uniformly dispersed. A
variety of suitable materials exist ranging from
non-degradable polymers, to biodegradable synthetic
polymers, to modified natural products such as gums,
starches, proteins, fats and waxes (24). The carriers
may also include non-toxic, non-therapeutic
components, such as liposomes, starburst polymers,
microspheres, microemulsions, nanocapsules or
macroemulsions to facilitate formulation, delivery,
controlled release or sustained action of the
therapeutic composition.

In one embodiment of the present invention, the
carrier component of the pharmaceutical composition is
a liposome. In an alternate embodiment, the carrier
component may be based upon proteinoid technology and
consist of various amino acids (16).

Liposomes are most frequently prepared from
phospholipids, but other molecules of similar
molecular shape and dimensions and having both a
hydrophobic and a hydrophilic moiety can be used. All
such suitable liposome-forming molecules are referred
to herein as lipids. One or more naturally occurring
and/or synthetic lipid compounds may be used in the
preparation of the liposomes.

Liposomes may be anionic, cationic or neutral
depending upon the choice of the hydrophilic group.
For instance, when a compound with a phosphate or a
sulfate group is used, the resulting liposomes will be
anionic. When amino-containing lipids are used, the
liposomes will have a positive charge, and will be
cationic liposomes. In addition, the pharmaceutical
compositions of the present invention may include
liposome carriers wherein the invasive protein has
been incorporated into the liposome bilayer.

Representative suitable phospholipids or lipid
compounds for forming liposomes include, but are not
limited to, phospholipid-related materials such as
phosphatidylcholine (lecithin), lysolecithin,
lysophosphatidylethanolamine, phosphatidylserine,
phosphatidylinositol, sphingomyelin,
phosphatidylethanolamine (cephalin), cardiolipin,
phosphatidic acid, cerebrosides, dicetylphosphate,
phosphatidylcholine, and
dipalmitoyl-phosphatidylglycerol. Additional
nonphosphorous-containing lipids include, but are not
limited to, stearylamine, dodecylamine,
hexadecylamine, acetyl palmitate, glycerol
ricinoleate, hexadecyl stearate, isopropyl myristate,
amphoteric acrylic polymers, fatty acid, fatty acid
amides, cholesterol, cholesterol ester,
diacylglycerol, diacylglycerolsuccinate, and the like.

In another embodiment of the present invention,
the therapeutic agent and the transporting ligand
might be incorporated together through a polymeric
carrier. For example, the polymeric carrier may be a
polymer chain. Suitable synthetic polymers include, in
particular, poly(ethylene glycol),
N-(2-hydroxypropyl)methacrylamide and polyvinyl
polymers. Other potential polymeric carriers are
polypeptide carriers, such as poly(α-amino acids),
including poly(α-L-lysine),
poly(N5-hydroxypropyl-L-glutamine) and
poly(L-aspartic acid). In addition,
naturally occurring proteins (albumin, immunoglobulins
and lectins), and polysaccharides (dextran and charged
derivatives) can be used as carriers. The therapeutic
agent and/or the transporting ligand may be attached
to the polymer chain through various reactive side
chains that may or may not be degradable in vivo (25).

The carrier may be selected or modified to bind
the transport enhancer and/or the therapeutic agent
either through simple absorption, an ionic interaction
or covalent linking. Preferably, the carrier is also
able to incorporate large amounts of the therapeutic
agent in an active form. The carrier component as
well as the therapeutic agent associated with the
carrier should be stable in the gut environment, but
the carrier may also be selected or modified to
release the therapeutic agent once it has been
transported across the mucosal barrier. The release
of the therapeutic agent may be effectuated by
degradative means, such as a cleavable bond, or by
degradation of the carrier component. Examples of
such release mechanisms may include stabilized Schiff
base linkages (26), acid-cleavable linkages (27) or
oligonucleotide sequences cleaved by serum factors
(28).

The compositions of the present invention are
typically formed by attaching the transport enhancer
either directly to the therapeutic agent or to a
carrier system. Because the bacterial adhesion
proteins described in the present invention bind cell
receptors, the method of attachment must not prevent
the binding of the bacterial protein to the receptor.
This can be tested beforehand on in vitro systems
containing the appropriate receptors, such as membrane
preparations or cell systems.

Various conjugation techniques are known in the
art, and the following conjugation techniques are
provided by way of illustration. Other conjugation
techniques can also be used when appropriate as will
be appreciated by those skilled in the art. Where the
therapeutic agent is a protein, conjugation may be
carried out using bifunctional reagents which are
capable of reacting with each of the proteins (i.e.,
the therapeutic protein and the transport enhancer
protein) thereby forming a bridge between the two
components. Covalent attachment of the transport
enhancer to either the therapeutic agent or the
carrier system, through either the available amine or
carboxy groups of the transport enhancer, may be
carried out using suitable conjugation reagents
including glutaraldehyde, cystamine and EDAC.
Other known conjugation agents may be used, as long as
they provide linkage of the transport factor without
denaturing the protein. One preferred method of
conjugation involves thiolation, wherein the transport
protein is treated with reagents such as
N-succinimidyl 3-(2-pyridyldithio)propionate (SPDP) to
form a disulfide bridge with another sulfhydryl group
either in the therapeutic agent or on the carrier.
Spacers might also be used and could include polymer
chains such as polyethylene glycol, a sugar or a
peptide sequence.

Alternatively, the transport enhancer could be
attached through a simple absorption method as
described in the following Examples. In yet another
embodiment, the compositions of the present invention
can be in the form of a fusion protein made by
recombinant DNA techniques. Thus, one of ordinary
skill can duplicate or mimic bacterial proteins which
are suitable as transport enhancers. The use of
recombinant DNA techniques requires knowledge of the
nucleic acid sequence of the polypeptide or protein
therapeutic agent to be delivered. The nucleic acid
fragment corresponding to the therapeutic agent is
linked to a nucleic acid fragment corresponding to the
chosen transport enhancer, thereby forming a
recombinant molecule. The recombinant molecule is
then operably linked to an expression vector and
introduced into a host cell to enable expression of a
fusion peptide (29) useful as a chimeric molecule in
the present invention. When the carrier component of
the pharmaceutical composition is also an amino acid
sequence, for example a polymer chain, the entire
pharmaceutical composition may be produced by
recombinant techniques.

The suitability of the resultant pharmaceutical
composition as an oral or topical dosage form can be
tested following the protocols set forth in the
following Examples. Compositions which are formulated
based upon the description of the present invention
will be administered to subjects at a dosage range
determined by a skilled investigator or attending
physician based upon known and accepted parameters.
The dosage regimen involved for a particular
therapeutic agent may be determined empirically, and
making such determinations is within the skill in the
art. Prior to administering the agent, it is
preferable to determine toxicity levels of the
therapeutic agent(s) so as to avoid deleterious
effects. Other considerations will include various
factors which modify the action of drugs, e.g., the
age, condition, body weight, sex and diet of the
patient, the nature and severity of the condition as
well as any complicating illness, time of
administration and other clinical factors. Optimal
dosages of the drug of interest can be determined by
one of ordinary skill in the art using conventional
techniques. As a general rule, the dosage levels will
correspond to the accepted and established dosage for
the particular therapeutic agent to be delivered,
i.e., the dosage will be adjusted to attain clinical
equivalence and/or bioequivalence to the parenteral
dosage form of the therapeutic agent, or correspond to
the dosage that achieves the desired physiological or
therapeutic response.

EXAMPLES

EXAMPLE 1 

Internalization of INV- and AIL-Transfected Bacteria
into the Human Enterocyte

A transfected bacterium which expresses the
bacterial adhesion protein on its surface effectively
serves as a model for the immobilization of the
proteins on the surface of a carrier. For example,
the sizes of a possible microsphere carrier and an E.
coli bacterium are very similar (approximately 1 µm in
diameter). Non-transfected E. coli serve as a control
in the following comparison studies.

To determine if a bacterial coat protein might
serve as a transport enhancer, it was first necessary
to establish that the protein was able to mediate the
adherence, internalization and ultimately transcytosis
or transport of transfected bacteria across a layer of
polarized human enterocytes. To test this scenario,
an in vitro model of a cellular layer/barrier was
established.

Methods:

Transfection and Maintenance of the Bacteria

Yersinia enterocolitica (8081c), E. coli PBR 322
(control plasmid-transfected) and E. coli HB101
carrying recombinant plasmids with the Y.
enterocolitica invasion genes for INV (E. coli PVM
101) and AIL (E. coli PVM 102), were grown and stored
as previously described (7). The construction of the
plasmids for the transfection was also performed as
described in Miller and Falkow (8).

For the bacteria/cell interaction experiments, Y.
enterocolitica 8081c was incubated overnight in Luria
broth (LB) at room temperature. E. coli PBR 322, PVM
101 and PVM 102 were incubated overnight in LB
containing 100 µg/ml ampicillin at 37°C. The
approximate bacterial density was then determined by
measuring the optical density (O.D.) of the bacterial
suspensions and comparing the measurement to a
standard curve of O.D. versus bacterial number.
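
The comparison of an O.D. reading against a standard curve can be sketched as a linear interpolation between calibration points. This is only an illustration: the function name and the calibration values below are hypothetical and are not taken from the specification, which does not report the curve itself.

```python
def bacteria_per_ml(od, curve):
    """Estimate bacterial density from an O.D. reading.

    curve: list of (od, bacteria_per_ml) calibration points.
    Linear interpolation between neighbouring points is an assumption.
    """
    pts = sorted(curve)
    if not pts[0][0] <= od <= pts[-1][0]:
        raise ValueError("O.D. reading is outside the calibrated range")
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= od <= x1:
            return y0 + (y1 - y0) * (od - x0) / (x1 - x0)


# Hypothetical calibration points, for illustration only.
EXAMPLE_CURVE = [(0.0, 0.0), (0.5, 4.0e8), (1.0, 8.0e8)]

density = bacteria_per_ml(0.25, EXAMPLE_CURVE)
```

The volume of suspension needed for a given inoculum then follows by dividing the desired number of bacteria by the estimated density.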

Cell Culture

The Caco-2 cell line (Ciba-Geigy Pharmaceuticals,
Horsham, Surrey) was used in the transport studies.
The cells were routinely used between passage numbers
95-120, maintained at 37°C under 10% CO2 in T175
flasks (Falcon Labware, Bedford, MA). Culture medium
consisted of Dulbecco's modified Eagle's medium (DMEM)
containing 10% fetal calf serum, 1% minimum essential
medium (MEM) non-essential amino acids, 1000 U/ml
penicillin, 100 µg/ml streptomycin and 0.3 mg/ml
glutamine. Cell stocks were passaged every five days
by briefly washing (x2) with Dulbecco's phosphate
buffered saline (PBS) [-Ca²⁺, -Mg²⁺], and incubating
for ten minutes at 37°C with 0.05% trypsin and 0.53 mM
EDTA. Cells were passaged at a ratio of 1:3 and were
fed every day except for the first day after
passaging. (All solutions were from Gibco, Grand
Island, NY).

Non-Polarized Cell Culture

Non-polarized cells were grown on plastic culture
dishes. The Caco-2 cells were passaged as described
above, diluted into culture medium and then counted on
a Neubauer hemocytometer (American Scientific
Products, McGaw Park, IL), to determine cell density.
One milliliter of the cell suspension containing 1.8 x
10⁵ cells was pipetted into each well of a 24-well
culture plate (Falcon Labware, Bedford, MA). The
cells were further incubated for ten days prior to the
studies.

Determination of Bacterial Adherence and Invasion

All of the culture medium used in the bacterial
studies was antibiotic-free. Non-polarized Caco-2
cells, in 24-well Falcon culture plates or Caco-2
monolayers, were washed (x2) in antibiotic-free
culture medium 24 hours prior to the experiment.
After further incubation overnight, the cells were
placed in fresh medium and equilibrated for one hour.

The non-polarized cell monolayers were routinely
inoculated with approximately 2.5 x 10 s bacteria per
well. The cells were assayed for both surface bound
bacteria and invaded/internalized bacteria using known
methods (30).

Results:

Bacterial Attachment and Internalization in the
Non-Polarized Human Enterocyte

Figure 4 illustrates the effect of invasin on the
binding of E. coli to the non-polarized human
enterocyte Caco-2 cell line and shows that the wild
type Yersinia, which would be expressing all of the
potentially invasive proteins, rapidly adheres to the
non-polarized Caco-2 cell layer. The INV-transfected
E. coli (closed circles) also demonstrates a rapid
surface attachment to the human enterocyte cell line.
Levels of surface adhered PVM 101 (INV) are at least
10-fold greater than that of the Yersinia bacteria
after nine hours of incubation. The E. coli control
also shows some adherence to the Caco-2 cells,
although levels are always 10-fold less than the
Yersinia or PVM 101. E. coli is known to have some
adherent capability in the intestine through the 987P
pilus (31).

A major difference occurs in the internalization
of the bacteria into the non-polarized cell. Figure 5
illustrates the effect of invasin on the
internalization of E. coli into the human enterocyte
Caco-2 cell line. This internalization is an
important prerequisite to transcytosis or delivery
across the epithelial barrier. Levels of the
internalized Yersinia climb rapidly to reach a plateau
of 1 x 10⁵ CFU/well. Internalized levels of the
INV-transfected E. coli (closed circles) are much
slower to increase but reach 1 x 10³ CFU/well after
nine hours. This is more than 10-fold greater than the
internalization of non-transfected E. coli which was
not greater than 100 CFU/well even after nine hours.

Very similar binding and internalization
characteristics are seen for the AIL-transfected E.
coli bacterium (PVM 102), see Figure 6 and Figure 7.
Both the levels of the adhered and the internalized
PVM 102, however, are less than the levels mediated by
invasin. This could result from the fact that the AIL
protein appears to be a later acting protein in the
invasion event, as compared to the INV protein. The
results demonstrate that both the INV and AIL proteins
are able to bind the cells through a receptor
expressed on the surface of the human enterocyte which
then mediates the uptake of a large bacterial particle
(approximately 1 µm) into the cell.

In a separate study the data were reproduced,
although only the results at the end of the nine hour
incubation are summarized in Figure 8. Again, high
levels of the E. coli control are found adhering to
the Caco-2 cells, but the levels are less than for any
of the bacteria that express the invasive proteins.
The bacteria can be arranged in order of
internalization competence as derived from Figure 8:
Yersinia > PVM 101 (INV) > PVM 102 (AIL) > PBR 322
(non-transfected).

EXAMPLE 2

Receptor-mediated Transcytosis Across the Polarized
Human Enterocyte

For an efficient drug delivery system that is
dependent on receptor-mediated uptake of
pharmaceutical compositions, delivery via transcytosis
is important. Receptor-mediated transcytosis can be
defined as the trafficking of the ligand and/or the
receptor from one membrane domain to the other in an
endosome derived from the plasma membrane.

The transfected bacteria were, therefore, tested
for their ability to penetrate or pass through the
Caco-2 monolayers by transcytosis as described in
Example 1.

Methods:

Polarized Culture

Cell culturing was performed substantially in
accordance with the methods of Example 1, with the
exception that the invasion and binding studies
requiring polarized Caco-2 cells were performed on
cells grown on a 25 mm diameter Cyclopore membrane
(polyethylene terephthalate), with a pore size of
0.45 µm and a density of 1.6 x 10⁶ pores/cm² (from
Falcon Labware). The cells were seeded at a cell
density of 1.8 x 10⁵ cells/cm² insert, with 2.5
ml/domain of culture medium. Cells routinely reached
confluency at five days. The cells were incubated for
a total of 21 days prior to use. As the cells grow
and divide, they form a confluent monolayer across the
insert. Under these conditions, the cells are able to
feed from both sides as they do in vivo.

To provide for the measurement of bacterial
passage across the cell monolayers, the Caco-2 cells
were cultured on filter inserts having larger pores.
Collagen-coated Transwell-COL filter inserts
(nitrocellulose; Costar, Cambridge, MA) were used
(average pore size of 3.0 µm and insert diameter of 24
mm). Caco-2 cells were plated at a cell density of
6.6 x 10⁴ cells/cm².

Measurement of Monolayer Confluency and Polarity

Prior to any experiment being conducted on the
cell monolayers grown on the filter inserts, the
monolayers were tested for confluency by measuring for
tight junction formation between cells as determined
by trans-epithelial electrical resistance (TEER).
TEER was determined using an EVOM-F Epithelial
Voltohmmeter (World Precision Instruments, New Haven,
CT) with STX "chopstick" electrodes. The measured
resistance was corrected for the area of the filters
and was routinely >1000 ohms·cm².
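
The area correction can be sketched as follows. The specification states only that the measured resistance was corrected for the filter area; subtracting the resistance of a cell-free (blank) insert and deriving the area from the nominal 25 mm insert diameter are assumptions made for this illustration, and the function name is hypothetical.

```python
import math


def teer_ohm_cm2(measured_ohm, blank_ohm, diameter_mm):
    """Area-corrected TEER in ohm*cm^2.

    measured_ohm: resistance read across the monolayer on its filter.
    blank_ohm:    resistance of a cell-free filter (assumed correction).
    diameter_mm:  filter diameter; the full disc is assumed to be the
                  growth area.
    """
    radius_cm = diameter_mm / 10.0 / 2.0
    area_cm2 = math.pi * radius_cm ** 2
    return (measured_ohm - blank_ohm) * area_cm2


# Illustrative readings only; not data from the specification.
teer = teer_ohm_cm2(measured_ohm=350.0, blank_ohm=100.0, diameter_mm=25.0)
```

Multiplying by the area, rather than dividing, reflects that resistance falls as the membrane area grows, so the product is the area-independent quantity.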

The permeability of the monolayers to
polyethylene glycol (PEG) (M. wt. 4,000 Da), inulin
(M. wt. 5,200 Da) and dextran (M. wt. 70,000 Da) was
routinely determined. ¹⁴C-labelled PEG 4000 (1 nmol;
2 x 10⁵ disintegrations per minute [dpm]), ¹⁴C-inulin
(1 nmol; 3.1 x 10⁴ dpm) (both from Amersham, Arlington
Heights, IL) and ¹⁴C-dextran (1 nmol; 9 x 10⁴ dpm;
from New England Nuclear, Boston, MA) were added to
the monolayers in culture medium (2.5 ml) for up to 24
hours. Medium (100 µl) from both the apical and
basolateral domains was removed after thorough mixing,
aliquoted into XtalScint Ready Caps (Beckman,
Fullerton, CA) and counted in a Beckman 6000
scintillation counter. The amount of ¹⁴C-PEG,
¹⁴C-inulin or ¹⁴C-dextran that had diffused through
the monolayer was then calculated. Only the monolayers
which demonstrated a TEER >1000 ohms·cm² and a PEG
diffusion of <2% in 24 hours were used for both the
monolayer characterization and for the bacterial
studies.
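
The acceptance test for a monolayer can be sketched as below. The 100 µl sample, 2.5 ml domain volume, 2 x 10⁵ dpm dose and the two thresholds come from the text; scaling the sample count up to the whole basolateral volume is an assumed way of calculating the diffused amount, and the function names and example counts are hypothetical.

```python
def percent_diffused(sample_dpm, dose_dpm,
                     sample_volume_ml=0.1, domain_volume_ml=2.5):
    """Percentage of the applied label found in the receiving domain.

    The dpm in the sampled aliquot is scaled to the full domain volume
    (an assumption about how the diffused amount was calculated).
    """
    total_dpm = sample_dpm * domain_volume_ml / sample_volume_ml
    return 100.0 * total_dpm / dose_dpm


def monolayer_acceptable(teer_ohm_cm2, peg_percent_24h):
    """Criteria stated in the text: TEER > 1000 ohm*cm^2 and
    PEG diffusion < 2% in 24 hours."""
    return teer_ohm_cm2 > 1000.0 and peg_percent_24h < 2.0


# Illustrative count only: 100 dpm in a 100 µl basolateral sample.
peg = percent_diffused(sample_dpm=100.0, dose_dpm=2.0e5)
usable = monolayer_acceptable(1200.0, peg)
```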

Formation of Intrinsic Factor and Vitamin B12 Complex

The method was derived from Gottlieb et al. (32) as
adapted by Allen and Mehlman (33). The required
amount of ⁵⁷Co-labelled vitamin B12 from Amersham
(CT2; 100-300 µCi/µg) was incubated with a 2-fold
molar excess of porcine intrinsic factor (IF) (Sigma,
St. Louis, MO). Incubation was in PBS (2 ml)
containing 1 mM CaCl2 and 0.5 mM MgCl2 (PBS++) with
0.1% bovine serum albumin (BSA), mixing end over end
at 4°C for two hours. An equal volume of freshly
prepared dextran-coated charcoal, 0.5% charcoal, 0.1%
dextran in PBS++ at 4°C, was added, vortexed
thoroughly and incubated for ten minutes at 4°C. The
charcoal was pelleted by centrifugation at 3,000 rpm
(1,500 x g) for 15 minutes in an IEC-Centra-8R
centrifuge. The supernatant containing the
IF-⁵⁷Co-Vitamin B12 (IF-⁵⁷Co-VB12) complex was
collected for further binding studies. Non-labelled
vitamin B12 (VB12) was used in place of ⁵⁷Co-VB12 to
make the IF-VB12 complex for a determination of
non-specific binding.

Binding Studies

Studies were performed to determine the polarity
of receptor distribution of the Caco-2 cells on the
filter inserts. The studies involved the use of the
complexed IF-⁵⁷Co-VB12 (IF-VB12), ¹²⁵I-fibronectin (FN;
from ICN, Minneapolis, MN), ¹⁴C-taurocholic acid (TA)
(54 mCi/mmol) from Amersham and ¹²⁵I-epidermal growth
factor (EGF) (1354 Ci/mmol) also from Amersham.
Twenty-four hours prior to the binding studies, the
cells were washed (x3) with binding medium (serum-free
culture medium with 0.1% BSA, Sigma) and were then
further incubated overnight. Immediately prior to the
experiment, the medium was replaced again with fresh
binding medium, and the cells were incubated for a
further hour at 37°C. The cells were then cooled to
4°C for 30 minutes, and the appropriate ligand was
added to either the apical or basolateral domains.

¹²⁵I-FN was added to a final concentration of
86 pM, and for the determination of non-specific
binding, a 100-fold molar excess of non-labelled
fibronectin was added. IF-⁵⁷Co-VB12 was present at
100 pM, again with a 100-fold molar excess of
non-labelled IF-VB12 for the determination of the
non-specific binding. ¹²⁵I-EGF was present at 80 pM,
with and without a 100-fold molar excess of the
non-labelled EGF. ¹⁴C-TA was present at 400 nM, with
and without a 100-fold excess of the non-labelled
taurocholic acid.
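
The use of the paired incubations can be sketched as follows. Taking specific binding as total binding minus the binding that remains in the presence of the 100-fold excess of non-labelled ligand is the conventional calculation and is assumed here; the specification does not spell it out. The function names and the counts are hypothetical.

```python
def specific_binding(total_counts, nonspecific_counts):
    """Specific binding: counts without competitor minus counts with
    the 100-fold excess of non-labelled ligand. Clamped at zero."""
    return max(total_counts - nonspecific_counts, 0.0)


def percent_by_domain(apical_specific, basolateral_specific):
    """Share of specific binding on each membrane domain, in percent."""
    total = apical_specific + basolateral_specific
    if total == 0:
        raise ValueError("no specific binding detected in either domain")
    return (100.0 * apical_specific / total,
            100.0 * basolateral_specific / total)


# Illustrative counts only; not data from the specification.
apical = specific_binding(450.0, 200.0)        # 250.0
basolateral = specific_binding(1000.0, 250.0)  # 750.0
distribution = percent_by_domain(apical, basolateral)
```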

The incubations were all carried out at 4°C for
six hours. To remove the unbound ligand, the cells
were washed (x3) at 4°C with PBS. For determination
of the γ emitters, the membranes with the cells were
cut out of the inserts and counted directly in 12 x 75
mm test tubes in a Cobra 2000 gamma counter (Packard,
Meriden, CT). Cells incubated with ¹⁴C-TA were
solubilized in 0.1 N NaOH (1 ml) and then detected
following the addition of 10 ml of Atomlight (New
England Nuclear, Cambridge, MA), in a Beckman 6000
scintillation counter.

Determination of Bacterial Invasion of the Monolayers
The TEER of each monolayer was checked
immediately prior to the bacterial inoculations with
approximately 10⁷ bacteria per filter insert, and
¹⁴C-PEG 4000 (1 nmol/insert) was also added at this
time to monitor monolayer leaking throughout the
experiment. Incubation of the cells with the bacteria
was for four hours at 37°C unless otherwise depicted
in the figure. The polarized monolayers were
routinely evaluated for TEER at each time point, and
basolateral medium (100 µl) was removed for the
determination of ¹⁴C-PEG diffusion.

Adaptations to the protocol of Isberg (30) were
used for the determination of bacterial invasion on
the polarized cells as follows: at the end of the
incubation period on the monolayers, the cells on the
Falcon inserts or on the Transwell-COL inserts were
cooled to 4°C before aspirating the medium from both
domains. Cells were washed with ice-cold PBS (x5) on
either domain, and one milliliter of a 1% Triton X-100
solution in PBS was added and incubated for five
minutes at room temperature. Luria broth (1.5 ml) was
added to the solubilized cells, which were serially
diluted further in LB and plated onto LB agar plates
with or without ampicillin for E. coli and
Y. enterocolitica, respectively. Plates were
incubated overnight, and colonies were counted to
determine the total number of bacteria [colony forming
units (CFU)] associated with the cells.
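
The back-calculation from a plate count to the total CFU associated with the cells can be sketched as below. The 2.5 ml lysate volume follows from the 1 ml of Triton X-100 solution plus 1.5 ml of Luria broth described above; the plated volume and the dilution factor are not stated in the specification and are hypothetical parameters here, as is the function name.

```python
def cfu_per_sample(colonies, dilution_factor, plated_volume_ml,
                   lysate_volume_ml=2.5):
    """Total CFU in the lysate from one well or insert.

    colonies:         colonies counted on one plate.
    dilution_factor:  overall dilution of the plated aliquot
                      (e.g. 100 for a 1:100 dilution).
    plated_volume_ml: volume spread on the plate (assumed parameter).
    lysate_volume_ml: 1 ml Triton X-100 solution + 1.5 ml LB.
    """
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * lysate_volume_ml


# Illustrative numbers only: 40 colonies from 0.1 ml of a 1:100 dilution.
total_cfu = cfu_per_sample(colonies=40, dilution_factor=100,
                           plated_volume_ml=0.1)
```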

Invasion of the bacteria into the cells of the
monolayer was determined by washing the cells with PBS
at room temperature, and adding 2.5 ml of medium
containing gentamicin sulfate (100 µg/ml) to both
domains. After a further 90 minutes at 37°C and
washing with PBS (x2), the cells were solubilized and
analyzed for CFU as described above.

Determination of Bacterial Passage Across the
Monolayers

To study bacterial passage across the monolayer,
the incubations were continued for up to 24 hours.
To prevent bacterial overgrowth, a "kill" of the
apically-located bacteria was performed six hours
after bacterial inoculation. Medium in the apical
domain was aspirated, and culture medium (2.5 ml)
containing gentamicin sulfate (50 µg/ml) was added.
After a further incubation for one hour, the apical
medium was replaced with culture medium containing
gentamicin sulfate (1 µg/ml) and ¹⁴C-PEG (1 nmol).
The number of bacteria in the basolateral domain of
the Transwell-COL inserts was determined at various
times. The filter inserts were removed from the
wells, transferred to 6-well plates containing
pre-equilibrated culture medium (2.5 ml) and further
incubated as required. The medium from the used
plates was analyzed for both ¹⁴C-PEG and total number
of bacteria, by determining CFU on agar plates as
previously described.

Results: 

The Caco-2 cell line is derived from a human
colonic tumor and exhibits a morphology consistent
with that of the gut epithelium (34). The Caco-2
cells, therefore, provide a generally accepted model
for the human enterocyte (35-38). The cells can be
grown as a confluent monolayer on plastic cultureware,
but under these conditions they are not polarized,
i.e., do not have sorted and differentiated domains.
Any receptors expressed by the cells, therefore, are
distributed over the entire surface of the cell.
Alternatively, the cells may be grown as a polarized
epithelial-like monolayer on a microporous membrane.
Under these conditions the various receptors are
sorted between the two membrane domains, and the cells
are a true in vitro model of the epithelial lining of
the human gut.

The monolayer has tight junctions between the
cells which make the cell monolayer highly
impermeable to most molecules having a molecular
weight >500 Da (38). The tight junctions separate the
apical (lumenal) and basolateral (serosal) domains of
the cells (39). In addition, the membrane in each
domain is sorted or specific to that domain, such that
the receptor population (40) and even the lipids are
different in the two domains (41, 42).

The electrical resistance and impermeability of
the monolayers are shown in Table 2. After just 12
days in culture, the cells formed confluent monolayers
with tight junctions, as demonstrated by the
electrical resistance. The electrical resistance does
increase somewhat after a further seven days in
culture, up to 821 Ω·cm².

Table 2

Polarity of the Caco-2 Monolayers

Parameter                     Measurement

TEER, 12 days in culture      [illegible] Ω·cm²
TEER, 19 days in culture      821 ± [illegible] Ω·cm²

                              (cm/min)
¹⁴C-PEG diffusion
    Blank                     6.7 x 10⁻⁴ ± 1.16 x 10⁻⁵
    + Cells                   4.8 x 10⁻⁵ ± 5.76 x 10^[illegible]
¹⁴C-inulin diffusion
    Blank                     3.04 x 10⁻⁴ ± 8.6 x 10⁻⁶
    + Cells                   1.97 x 10⁻⁶ ± 1.68 x 10⁻⁶
¹⁴C-dextran diffusion
    Blank                     5.52 x 10^[illegible] ± 3.32 x 10⁻⁵
    + Cells                   3.86 x 10⁻⁶ ± 3.0 x 10⁻⁶


The monolayers were most permeable to the 4,000
molecular weight PEG (see Table 2) with a permeability
coefficient of 4.8 x 10⁻⁵ cm/min. The cells were
highly impermeable to a ¹⁴C-labelled dextran with a
molecular weight of 70,000 Da (a permeability
coefficient of 3.86 x 10⁻⁶ cm/min). With the Caco-2
cells being impermeable to relatively small molecules,
one would expect that they would be impenetrable by
relatively large particles such as bacteria.

The polarity of the monolayers used in the
studies of invasion proficient bacterial proteins is
depicted in Figure 9. The data demonstrate that the
receptor population is sorted according to apical and
basolateral membrane domains.

The fibronectin receptor (FN-R) is only found on
the basolateral domain. This might be expected of a
receptor whose major role is to bind the cell to the
extracellular matrix (43). This is of concern,
however, since the FN-R is a β1 integrin receptor,
similar to the receptor for the INV protein (4). The
epidermal growth factor receptor (EGF-R) is also found
predominantly on the basolateral domain (>70%). This
is a reasonable outcome because the source of EGF in
vivo would be from the blood. Similar results with
the EGF-R on polarized Caco-2 cells have been
demonstrated previously (44).

Two other receptor populations that are normally
found on the apical or lumenal side of the gut were
also characterized. These were the taurocholic acid
receptor (TA-R) (45) and the intrinsic factor receptor
(IF-R) (46). IF-R is responsible for the active
uptake of vitamin B12 (VB12). Both of these receptors
were found predominantly on the apical domain in the
in vitro model of the polarized human enterocyte.
These data agree with previous studies of the polarity
of brush border enzymes shown in Caco-2 cells (47).

The data suggest a high degree of polarity of the
Caco-2 monolayers on the culture inserts. The cells
form an impermeable barrier to most molecules and,
therefore, provide a good model for the human gut.
Studies to identify invasion proficient bacterial
proteins, such as INV and AIL, with this model are
reflective of the results one might expect in the
human gut.

Bacterial Attachment and Internalization of the
Polarized Human Enterocyte

As previously discussed, the polarized in vitro
model is comparable to the in vivo situation, as
shown by receptor distribution. After bacterial
inoculation, and as with the non-polarized cells
shown in Figures 4-8, relatively high numbers
of the non-transfected E. coli were seen adhered to
the polarized cells (see Figure 10). Again, this may
result from some inherent property of the E. coli,
specifically the 987P pilus (31).

The major effect of the invasive proteins lies in
the internalization of the respective bacteria (see
Figure 11). For the INV- and AIL-transfected E. coli,
internalized levels of the bacteria were 100-fold and
50-fold greater, respectively, than the
non-transfected E. coli. In this particular study, the
levels of the internalized transfected bacteria were
very comparable to those found with the wild-type
Yersinia bacteria.

The data suggest that the receptors for both the
INV and AIL proteins are available on the apical
domain of the polarized human enterocyte. This was
reassuring following the fibronectin receptor findings
(in Figure 9) which suggested that this group of
receptors would not be available for binding. After
the binding event has occurred through the apical
domain, the bacteria are internalized into the cells.

Bacterial Passage Across the Polarized Human
Enterocyte

The time course of the transcytosis of the
bacteria is shown in Figure 12. The levels of the
basolateral-located non-transfected E. coli control
remained flat throughout the 12 hour study, and were
very low. Both the INV- and AIL-transfected
E. coli, however, are taken up and transcytosed at
levels greater than the wild type Yersinia, and for
the AIL protein the increase is greater than 10-fold.
In general, it was found that the AIL protein seemed to
mediate the internalization and transcytosis event far
more efficiently in polarized human enterocyte Caco-2
cells as compared to non-polarized cells.

The transcytosis mediated by both INV and AIL is
quite rapid, but certainly not as quick as the
adhesion event. Therefore, any slowness on the part
of the proteins to mediate uptake of a particle system
will not be detrimental to the system if they also
significantly increase the residence time of the
protein at the site of uptake through the binding
event.

The integrity of the cell monolayer was
maintained throughout this study by killing the
bacteria in the apical domain after six
hours of incubation. Therefore, the bacteria in the
basolateral domain represent the bacteria that had
been bound and internalized into the enterocytes after
the initial six hours of incubation. It should also
be noted that the bacteria will continue to divide
both inside the cells and after they have crossed the
monolayers, and this should be remembered when looking
at the total number of bacteria.

To determine the route that the bacteria take
across the cell layer, the integrity of the monolayer
was checked at the end of every study. ¹⁴C-PEG (4000
Da) diffusion was measured as a marker for tight
junction integrity between the cells. It was found
that the level of PEG diffusion during the 24 hour
incubation with the bacteria did not increase over
non-inoculated monolayers. This suggests that the
bacteria do not cross the monolayers through the tight
junctions nor through a degradation of monolayer
integrity. The data suggest that the INV- and
AIL-transfected bacteria are able to cross the cells
through an internalization and transcytosis event.
The finding that the particles crossed the membrane
barrier was a novel observation and formed the basis
of the current invention.

It has been generally accepted that Yersinia
enterocolitica, which expresses both INV and AIL,
enters the body from the gut through the M cells of
the Peyer's patches (9, 10). This would not be a
preferred route for therapeutic delivery. The M cells
are the most efficient way to deliver an antigen to
the immune system from the gut and, therefore, this
route increases the chance of eliciting an immune
response to the therapeutic agent. The present data,
with the human enterocyte Caco-2 cell line, suggest
that a drug delivery system based on INV- or
AIL-mediated uptake would also transport a therapeutic
agent across the enterocytes, and thereby allow the
pharmaceutical composition to reach the systemic
circulation. This would increase the potential
capacity of the delivery system and decrease or
prevent the possible immunologic presentation of the
therapeutic agent.

EXAMPLE 3

Expression, purification and testing of the MBP-INV
and MBP-AIL fusion proteins

Preparation and purification of bacterial protein

Nucleic acid sequences encoding either the INV or
AIL protein, in combination with MBP, were transfected
into E. coli using known techniques (18). The
expressed protein was extracted from the transfected
bacteria by two passes in a French pressure cell at
14,000 p.s.i. Purification of the
MBP-INV and MBP-AIL was performed as described by
Leong et al. (17) using affinity chromatography with
cross-linked amylose (18).

The amino acid sequence for MBP is illustrated in
Figure 3 and SEQ ID NO:3. The amino acid sequence for
an exemplary MBP-INV fusion protein is illustrated in
Figure 14 and SEQ ID NO:4. The amino acid sequence
for an exemplary MBP-AIL fusion protein is illustrated
in Figure 15 and SEQ ID NO:5.

In vitro Assaying of the Fusion Proteins

After purification, the proteins were stored at
-80°C in 10 mM Tris buffer, pH 8.0, with 100 mM NaCl
and 1 mM EGTA. Assays were established to demonstrate
that the proteins were able to bind to the appropriate
receptor on the human enterocyte Caco-2 cell after
labelling and immobilization of the MBP-INV protein.

Radiolabelling of Bacterial Coat Proteins and
MBP-fusion Protein

Proteins were diluted to a concentration of 500
µg/ml in iodination buffer (100 mM NaH2PO4, pH 6.5)
and were then microdialyzed overnight in iodination
buffer. Two Iodobeads (Pierce Chemicals, Rockford,
IL) were used per protein; these were prewashed
(x2) in iodination buffer, blotted dry and placed in
borosilicate tubes. Iodination buffer (100 µl) was
added to the beads together with 10 µl of Na¹²⁵I
(carrier free, specific activity 100 mCi/ml, from New
England Nuclear). After reacting for five minutes,
the protein was added to provide 200 µg/tube. The
reaction mixture was mixed, allowed to react for five
minutes at room temperature, and was then removed from
the Iodobeads. Ten microliters of 1 M
parahydroxybenzoate was added to bind any non-labelling
¹²⁵I, and the mixture was incubated for a further ten
minutes on ice. Separation of the ¹²⁵I-labelled protein
and the unbound ¹²⁵I was carried out on a PD10 desalting
column (Pharmacia, Piscataway, NJ) which had been
pre-equilibrated with PBS. Fractions eluted with PBS
(500 µl) were collected and assessed for radioactivity
in a Cobra 5000 gamma counter (Packard, Downers Grove,
IL).
The fractions containing the labelled protein
were pooled and then exhaustively dialyzed at 4°C in
PBS with 0.02% Tween 20. The dialysate was
continually monitored for ¹²⁵I, until no further
non-labelling ¹²⁵I was removed. The amount of unbound
¹²⁵I present with the radiolabelled protein was
determined by precipitation with a final 6% solution of
trichloroacetic acid (TCA). The amount of protein was
determined using the BCA protein assay (Pierce
Chemicals, Rockford, IL). The final yield of MBP-INV
after radiolabelling was 29%. The amount of unbound
¹²⁵I was 1.5% and the specific activity of the
radiolabelled MBP-INV was 3.23 x 10⁶ cpm/µg.

Binding Assay 

A conventional binding assay was performed using
¹²⁵I-labelled MBP-INV, and the specificity of the cell
binding with this protein was determined by competing
with non-labelled MBP-INV, MBP-AIL and the MBP protein
alone. ¹²⁵I-MBP-INV was added to each well of a
24-well plate containing a confluent monolayer of the
Caco-2 cells. The final concentration of the protein
was 100 ng/ml (833 pM) and 3.2 x 10⁵ cpm/ml. A
100-fold excess of each competing protein was added as
required. The cells were incubated with the proteins
for two hours at 37°C under 10% CO2 in DMEM with 10%
fetal bovine serum (FBS). After cooling the cells to
4°C for 30 minutes, the cells were washed (x3) with
PBS containing 0.1% BSA and solubilized in 0.1 N NaOH
before counting in the Cobra 6000 gamma counter.
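
By way of illustration only, the concentrations quoted for the binding assay can be cross-checked with a short Python sketch. The function names are hypothetical. The molar mass it prints is merely the value implied by equating 100 ng/ml with 833 pM; it is not a figure stated in this disclosure.

```python
def molar_mass_from_concentrations(mass_conc_ng_per_ml, molar_conc_pM):
    """Molar mass (g/mol) implied by a mass and a molar concentration."""
    grams_per_litre = mass_conc_ng_per_ml * 1e-9 * 1e3  # ng/ml -> g/l
    mol_per_litre = molar_conc_pM * 1e-12               # pM -> mol/l
    return grams_per_litre / mol_per_litre


def counts_per_ml(mass_conc_ng_per_ml, specific_activity_cpm_per_ug):
    """Radioactivity (cpm/ml) of a labelled protein solution."""
    ug_per_ml = mass_conc_ng_per_ml / 1000.0
    return ug_per_ml * specific_activity_cpm_per_ug


# Values quoted in the text: 100 ng/ml, 833 pM, 3.23 x 10^6 cpm/ug.
implied_mass = molar_mass_from_concentrations(100, 833)
expected_cpm = counts_per_ml(100, 3.23e6)

print(f"Implied molar mass: {implied_mass / 1000:.0f} kDa")
print(f"Expected activity:  {expected_cpm:.2e} cpm/ml")
```

The second figure agrees with the 3.2 x 10⁵ cpm/ml stated for the assay, which suggests the specific activity and the working concentration are mutually consistent.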

Results 

The results are summarized in Figure 13. The
binding of ¹²⁵I-labelled MBP-INV was inhibited by more
than 70% by the non-labelled MBP-INV, whereas the
MBP-AIL protein did not appear to inhibit binding. The
control protein, MBP, did appear to cause some
inhibition of the MBP-INV binding (27%). The results,
however, indicate that the INV protein binds the
Caco-2 cells through a receptor-specific mechanism. More
importantly, the isolated form of the protein retained
its binding ability and, therefore, provided a
suitable invasion proficient bacterial protein for use
in the pharmaceutical compositions of the present
invention.
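
By way of illustration only, the percentage inhibition figures reported for a competition assay of this kind are typically derived as sketched below in Python. The function name is hypothetical, and the counts are placeholders chosen to show the arithmetic; they are not measurements from this study.

```python
def percent_inhibition(bound_without_competitor, bound_with_competitor):
    """Percentage reduction in bound label caused by a competitor."""
    if bound_without_competitor <= 0:
        raise ValueError("control binding must be positive")
    fraction_remaining = bound_with_competitor / bound_without_competitor
    return 100.0 * (1.0 - fraction_remaining)


# Placeholder counts (cpm), NOT data from the study.
control_cpm = 10_000.0
competitor_cpm = 2_500.0

inhibition = percent_inhibition(control_cpm, competitor_cpm)
print(f"Inhibition: {inhibition:.0f}%")
```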



EXAMPLE 4 

INV and AIL Proteins with Carrier Component 

One embodiment of the pharmaceutical composition
of the present invention involves a
therapeutic/carrier combination whose uptake is
mediated by a transport enhancer, such as the INV or
AIL proteins. The MBP-INV protein was associated with
fluorescently labelled microspheres and liposomes to
evaluate such a delivery system.

Methods:

Coating of Latex Microspheres with Bacterial Proteins

Latex microspheres, labelled with a fluorescent
dye (phycoerythrin, PC) and having an average diameter
of 0.996 µm, were obtained from Polysciences,
Warrington, PA. The PC-labelled microspheres (2.27 x
10¹⁰) were washed (x4) with a 0.1 M borate buffer,
pH 8.5. After each wash, the microspheres were
collected by centrifugation at 8,000 rpm for six
minutes in an Eppendorf centrifuge.

The latex microspheres were coated with the
bacterial coat protein by simple adsorption. The
microspheres were resuspended in 300 microliters of
10 mM Tris buffer (pH 8.0) containing 100 mM NaCl,
1 mM EGTA and 400 µg of the MBP-INV protein. A
further one milliliter of the borate buffer was then
added.

To remove the free or uncoated protein, the
microspheres were again centrifuged at 11,000 rpm for
ten minutes in the Eppendorf centrifuge, and the
supernatant was collected for protein determination in
the BCA assay. Usually, no free protein was
found remaining in the supernatant, i.e., all the
protein was coating the microspheres. The coated
microspheres were subsequently resuspended in the
borate buffer (1 ml) with 10 mg/ml BSA, incubated for
30 minutes at room temperature, and then collected by
centrifugation. The microspheres were washed (x2)
with the borate buffer/BSA (1 ml) before being finally
resuspended in PBS (1 ml) containing 10 mg/ml of BSA,
0.1% NaN3 and 5% glycerol. The microspheres were then
stored at 4°C.
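
By way of illustration only, the coating step can be put into per-particle terms with the following Python sketch. It assumes that all 400 µg of MBP-INV adsorbs (consistent with the observation that usually no free protein remained in the supernatant), that none of the 2.27 x 10¹⁰ microspheres are lost during washing, and that the microspheres are smooth spheres of the stated average diameter. The function names are hypothetical.

```python
import math


def protein_per_sphere_fg(total_protein_ug, n_spheres):
    """Average protein mass per microsphere in femtograms (1 ug = 1e9 fg)."""
    return total_protein_ug * 1e9 / n_spheres


def sphere_surface_area_um2(diameter_um):
    """Surface area of a smooth sphere, pi * d^2, in square micrometres."""
    return math.pi * diameter_um ** 2


# Values quoted in the text: 400 ug protein, 2.27e10 spheres, 0.996 um.
per_sphere = protein_per_sphere_fg(400, 2.27e10)
area = sphere_surface_area_um2(0.996)
density = per_sphere / area  # fg of protein per square micrometre

print(f"{per_sphere:.1f} fg per sphere over {area:.2f} um^2 "
      f"gives {density:.1f} fg/um^2")
```

Under those assumptions each microsphere carries on the order of 18 fg of protein, or roughly 5 to 6 fg per square micrometre of surface.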

Adherence of the INV-Coated Microspheres to Cultured
Cells

Two cell lines were used to evaluate the
adherence of the bacterial protein/microsphere
compositions: the HEp-G2 cell line (a human
hepatocellular carcinoma cell line, ATCC #HB-8065)
and the Caco-2 cell line. The HEp-G2 cell line is
epithelial in morphology and is routinely used as an
in vitro cell model of the liver hepatocyte. The
cells were plated onto glass coverslips (Baxter, McGaw
Park, IL) at a cell density of 1 x 10⁵ cells/cm² in a
6-well Costar culture plate. The cells were incubated
for two days in Dulbecco's minimum essential medium,
with 5% FBS and 0.1% non-essential amino acids (all
from Gibco), at 37°C and 5% CO2. DMEM (2 ml) was
added to the wells with INV-coated PC-microspheres (2
x 10⁸). Control wells were established using uncoated
PC-microspheres (2 x 10⁸). The cells were further
incubated on a rocker at 37°C for two hours before
cooling to 4°C and washing (x3) with ice-cold PBS (2
ml). The coverslips were then viewed under a Nikon
Optiphot-2 microscope with fluorescence adaptation,
and photographs were taken using a Nikon Fx-35WA
camera.

Conjugation of MBP-invasin to liposomes

The liposomes, composed of
dipalmitoylphosphatidylcholine (DPPC):cholesterol
(chol):N-glutaryl-dioleoylphosphatidylethanolamine
(NG-DOPE), were prepared by sonication. Solvent free
lipid films were prepared at a mole ratio of
DPPC:chol:NG-DOPE of 2:1:0.1 and contained a trace
amount of [³H]-cholesteryl hexadecyl ether (CE) as a
marker for total lipid. The lipid films were hydrated
in Mes-acetate saline buffer (20 mM Mes, 20 mM
NaAcetate, pH 5.5, 0.15 M NaCl) and sonicated to form
small unilamellar liposomes. To 0.2 µmol total lipid
was added 0.4 mg of
1-ethyl-3-(3-dimethylaminopropyl)carbodiimide
hydrochloride (EDC) and 0.2 mg of
N-hydroxysulfosuccinimide (S-NHS), and the samples were
mixed for 15 minutes at room temperature. MBP-invasin
(0.2 mg) was added and the pH of the suspension
adjusted to 8.0 using a small aliquot of 0.4 M NaHCO3
buffer. The sample was then stirred overnight at 4°C.
Unconjugated MBP-invasin was removed from liposomes by
centrifuging the samples for 10 minutes at 100,000 x g
in an air driven ultracentrifuge. Pelleted liposomes
were resuspended in PBS, pH 7.0, and centrifuged twice
more to remove unconjugated protein. The conjugated
MBP-invasin was determined using the BCA assay, and
lipid recovery was quantitated by scintillation
counting. The final MBP-invasin:total lipid ratio was
between 60 and 100 µg/µmol lipid.
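
By way of illustration only, the final protein:lipid ratio can be related to the amounts offered in the coupling reaction with the Python sketch below. It assumes the quantities as read from the text (0.2 mg MBP-invasin offered to 0.2 µmol total lipid); the function name and the notion of a "coupling fraction" are introduced here for the example and are not terms used in this disclosure.

```python
def coupling_fraction(final_ratio_ug_per_umol, protein_added_ug, lipid_umol):
    """Fraction of the offered protein:lipid ratio retained after washing."""
    if protein_added_ug <= 0 or lipid_umol <= 0:
        raise ValueError("protein and lipid amounts must be positive")
    input_ratio = protein_added_ug / lipid_umol  # ug protein per umol lipid
    return final_ratio_ug_per_umol / input_ratio


# Offered: 200 ug protein to 0.2 umol lipid; recovered: 60 to 100 ug/umol.
low = coupling_fraction(60, 200, 0.2)
high = coupling_fraction(100, 200, 0.2)

print(f"Coupling fraction: {low:.0%} to {high:.0%}")
```

On that reading, between roughly 6% and 10% of the offered protein per unit lipid remained associated with the liposomes after the wash steps.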

Uptake of liposomes by Caco-2 cells

Dilutions of unconjugated liposomes and
MBP-invasin conjugated liposomes were made in RPMI medium
(Gibco) and incubated with confluent monolayers of
non-polarized Caco-2 cells grown in a 24 well plate
for one hour at 37°C. The cells were washed three
times with RPMI medium and dissolved by adding 0.1 N
NaOH (1 ml) to each well. Dissolved cells (100 µl)
were used to quantitate cellular protein, while 900 µl
of the samples were processed for scintillation
counting and lipid quantitation.

Results 

A highly visible difference in the adherence of
the coated microspheres vs. non-coated microspheres
was found on the cells on the coverslips, i.e., the
coated microspheres became adherent to the human
enterocyte. The effect was observed on both HEp-G2
cells and on the human enterocyte cell line Caco-2.
The non-coated microspheres, however, showed no
visible adherence to the Caco-2 cells.

The data for the MBP-INV-conjugated liposomes are
presented in Figure 16. The results demonstrate an
uptake of 5.6-fold greater levels of the
MBP-INV-conjugated liposomes over the non-conjugated
liposomes (1.47 nmol/well vs. 0.265 nmol/well). The
amount of lipid uptake was found to be concentration
dependent.
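
By way of illustration only, the fold difference in liposome uptake can be recomputed from the two per-well values quoted above, as in the following Python sketch. The per-well figures are those given in the text; the normalization to cellular protein is a hypothetical helper suggested by the protein determination described in the methods, and the 0.5 mg value passed to it is a placeholder, not a measurement.

```python
def fold_increase(treated, control):
    """Ratio of treated to control uptake."""
    if control <= 0:
        raise ValueError("control uptake must be positive")
    return treated / control


def uptake_per_mg_protein(lipid_nmol, cell_protein_mg):
    """Lipid uptake normalized to the cellular protein in the well."""
    if cell_protein_mg <= 0:
        raise ValueError("cell protein must be positive")
    return lipid_nmol / cell_protein_mg


# Per-well lipid uptake quoted in the text (nmol/well).
ratio = fold_increase(1.47, 0.265)
print(f"Conjugated vs. unconjugated uptake: {ratio:.2f}-fold")

# Placeholder protein content (mg/well), for illustration only.
normalized = uptake_per_mg_protein(1.47, 0.5)
```

The ratio of the rounded per-well values comes to about 5.5, close to the 5.6-fold difference reported; the small gap is presumably due to rounding of the quoted figures.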

The foregoing descriptions of the specific
embodiments fully reveal the general nature and
applicability of the present invention such that
others can readily adapt and/or optimize the teachings
and specific embodiments to produce an assortment of
pharmaceutical compositions using a variety of
therapeutic agents, carrier components and invasive
protein transport enhancers. Any such modifications
and adaptations are intended to be embraced within the
meaning and range of equivalents of the disclosed
embodiments. It is also to be understood that the
phraseology and terminology employed herein are for
the purpose of description and not of limitation.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Amgen Inc.

(ii) TITLE OF INVENTION: COMPOSITIONS FOR INCREASED
BIOAVAILABILITY OF ORALLY DELIVERED THERAPEUTIC AGENTS

(iii) NUMBER OF SEQUENCES: 5

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Amgen Inc.
(B) STREET: 1840 Dehavilland Drive
(C) CITY: Thousand Oaks
(D) STATE: California
(E) COUNTRY: USA
(F) ZIP: 91320-1789

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3600 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 413..2920

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GAGTCGTACT GTGGGGAAAA CCGGCGAGAG CGAAGCGGCG GTCCATATAC CCTCCTTAAC 
TAAGCCAGCG GTTGCTTAGT CGCATTAGAT TAATGCATCG TGAAATGCAG AGAGTCTATT 
TTATGAGACG AATGTAAACT ATTTTGATAA TAATAATATA TCACAATATA TATATACATG 
CTAAATATAA CCTGACAATT AAATTAACAA GCTAATATTA CCATGATGAT TTTTTTTTTT 
TGCATTTCAT TTGTCATTGC TGTTATTTTT AATTTTTTAA TTTTATTTTT GTAAGTTCTG 
CTATTCTATT GTTAGTGTTT GCGAGAGAGA AGAAGTTATT TCTTGTCGCT GTTTTCATTT 360

CTGTTGCTTA AGTAAATATT ACCGCGTTAA TTTATACCTA AGGGGTACAC TA ATG 415 

Met 
1 

TAT TCA TTT TTT AAT ACG CTA ACT GTG ACT AAA ATC ATT AGC AGG CTA 463 
Tyr Ser Phe Phe Asn Thr Leu Thr Val Thr Lys lie lie Ser Arg Leu 
5 10 15 

ATA TTA TCG ATC GGT TTA ATA TTT GGA ATA TTT ACT TAT GGG TTC TCA * 511 
lie Leu Ser He Gly Leu He Phe Gly He Phe Thr Tyr Gly Phe Ser 
20 25 30 

CAG CAA CAT TAT TTT AAT TCA GAA GCG TTA GAG AAC CCC GCT GAA CAT 559 
Gin Gin His Tyr Phe Asn Ser Glu Ala Leu Glu Asn Pro Ala Glu His 
35 40 45 

AAT GAG GCT TTC AAT AAG ATA ATC AGT ACC GGA ACC AGT CTG GCG GTA 607 
Asn Glu Ala Phe Asn Lys He He Ser Thr Gly Thr Ser Leu Ala Val 
50 55 60 65 

TCG GGT AAT GCA TCC AAT ATC ACC AGG TCA ATG GTA AAT GAC GCG GCA 655 
Ser Gly Asn Ala Ser Asn He Thr Arg Ser Met Val Asn Asp Ala Ala 
70 75 80 

AAT CAG GAA GTA AAA CAC TGG TTA AAT AGA TTT GGG AC A ACT CAG GTC 7 03 

Asn Gin Glu Val Lys His Trp Leu Asn Arg Phe Gly Thr Thr Gin Val 
85 90 95 

AAT GTT AAC TTT GAT AAA AAG TTC TCC CTC AAA GAA AGT TCT CTT GAT 7 51 

Asn Val Asn Phe Asp Lys Lys Phe Ser Leu Lys Glu Ser Ser Leu Asp 
100 105 110 

TGG CTG TTG CCT TGG TAT GAC TCT GCT TCA TAT GTC TTT TTT AGT CAG 79 9 

Trp Leu Leu Pro Trp Tyr Asp Ser Ala Ser Tyr Val Phe« Phe Ser Gin 
115 120 125 

TTG GGT ATA AGA AAT AAA GAC AGT CGC AAT ACC CTT AAT ATC GGC GCT 847 
Leu Gly He Arg Asn Lys Asp Ser Arg Asn Thr Leu Asn He Gly Ala 
130 135 140 145 

GGG GTG CGT ACC TTC CAA CAA AGT TGG ATG TAT GGC TTT AAC ACT TCC 895 
Gly Val Arg Thr Phe Gin Gin Ser Trp Met Tyr Gly Phe Asn Thr Ser 
150 155 160 

TAT GAC AAT GAT ATG ACT GGG CAC AAT CAT CGT ATT GGC GTG GGT GCA 94 3 

Tyr Asp Asn Asp Met Thr Gly His Asn His Arg He Gly Val Gly Ala 
165 170 175 

GAA GCC TGG ACT GAT TAT TTA CAA TTA TCG GCC AAT GGT TAT TTT CGC 991 
Glu Ala Trp Thr Asp Tyr Leu Gin Leu Ser Ala Asn Gly Tyr Phe Arg 
180 185 190 

CTC AAT GGT TGG CAT CAA TCT CGT GAT TTC GCG GAC TAT AAT GAG CGC 103 9 

Leu Asn Gly Trp His Gin Ser Arg Asp Phe Ala Asp Tyr Asn Glu Arg 
195 200 205 
CCG GCA AGC GGG GGC GAC ATT CAC GTC AAA GCG TAT TTA CCT GCG CTG 1087
Pro Ala Ser Gly Gly Asp Ile His Val Lys Ala Tyr Leu Pro Ala Leu
210 215 220 225

CCA CAA TTG GGC GGG AAA TTA AAA TAT GAG CAG TAC CGT GGT GAG CGG 113 5 

Pro Gin Leu Gly Gly Lys Leu Lys Tyr Glu Gin Tyr Arg Gly Glu Arg 
230 235 240 

GTG GCT TTA TTT GGT AAA GAT AAC CTG CAA AGT AAC CCT TAT GCG GTG 118 3 

Val Ala Leu Phe Gly Lys Asp Asn Leu Gin Ser Asn Pro Tyr Ala Val 
245 250 255 

/ 

ACC ACA GGG CTT ATT TAT ACG CCG ATC CCC TTC ATT ACA CTG GGG GTC 1231 
Thr Thr Gly Leu lie Tyr Thr Pro lie Pro Phe lie Thr Leu Gly Val 
260 265 270 

GAT CAA CGA ATG GGA AAA AGT CGG CAG CAT GAA ATA CAA TGG AAC TTA 1279 
Asp Gin Arg Met Gly Lys Ser Arg Gin His Glu lie Gin Trp Asn Leu 
275 280 285 

CAA ATG GAT TAT CGC CTC GGC GAA AGT TTT CGT TCG CAG TTT AGC CCC 1327 
Gin Met Asp Tyr Arg Leu Gly Glu Ser Phe Arg Ser Gin Phe Ser Pro 
290 295 300 305 

GCA GTG GTG GCC GGA ACT CGT TTA CTG GCT GAG AGC CGT TAT AAT CTG 1375 
Ala Val Val Ala Gly Thr Arg Leu Leu Ala Glu Ser Arg Tyr Asn Leu 
310 315 320 

GTT GAG CGC AAT CCA AAT ATT GTT CTG GAA TAC CAA AAA CAG AAT ACT 142 3 

Val Glu Arg Asn Pro Asn He Val Leu Glu Tyr Gin Lys Gin Asn Thr 
325 330 335 

ATC AAA TTG GCA TTT TCA CCC GCC GTA CTC TCC GGC CTG CCG GGG CAG 1471 
He Lys Leu Ala Phe Ser Pro Ala Val Leu Ser Gly Leu Pro Gly Gin 
340 345 350 

GTT TAT TCC GTT AGT GCA CAA ATA CAG TCT CAA TCT GCA CTA CAA CGT 1519 
Val Tyr Ser Val Ser Ala Gin He Gin Ser Gin Ser Ala Leu Gin Arg 
355 360 365 

ATT CTC TGG AAT GAT GCG CAA TGG GTT GCT GCC GGC GGC AAA TTA ATA 1567 
He Leu Trp Asn Asp Ala Gin Trp Val Ala Ala Gly Gly Lys Leu He 
370 375 380 385 

CCC GTC AGT GCA ACA GAT TAC AAT GTG GTC TTA CCG CCT TAT AAA CCG 1615 
Pro Val Ser Ala Thr Asp Tyr Asn Val Val Leu Pro Pro Tyr Lys Pro 
390 395 400 

ATG GCA CCA GCG AGT CGT ACT GTG GGG AAA ACC GGC GAG AGC GAA GCG 1663 
Met Ala Pro Ala Ser Arg Thr Val Gly Lys Thr Gly Glu Ser Giu Ala 
405 410 415 

GCG GTC AAT ACC TAT ACC CTC AGC GCC ACG GCT ATC GAT AAC CAC GGC 1711 
Ala Val Asn Thr Tyr Thr Leu Ser Ala Thr Ala He Asp Asn His Gly 
420 425 430 

AAT AGT TCT AAT CCA GCT ACG TTG ACC GTT ATT GTG CAG CAA CCT CAG 17 59 

Asn Ser Ser Asn Pro Ala Thr Leu Thr Val He Val Gin Gin Pro Gin 
435 440 445 
TTC GTT ATT ACC TCG GAA GTG ACT GAT GAT GGT GCG CTT GCT GAT GGC 1807
Phe Val Ile Thr Ser Glu Val Thr Asp Asp Gly Ala Leu Ala Asp Gly
450 455 460 465

AGG ACT CCC ATC ACG GTG AAA TTT ACA GTG ACT AAT ATT GAT AGT ACG ISoS 
Arg Thr Pro lie Thr Val Lys Phe Thr Val Thr Asn He Asp Ser Thr 
470 476 480 

CCG GTT GCC GAG CAA GAG GGG GTG ATA ACC ACC AGT AAT GGT GCG CTT 1903 
Pro Val Ala Glu Gin Glu Gly Val He Thr Thr Ser Asn Gly Ala Leu 4 
485 490 495 

'CCC AGT AAA GTC ACA AAA AAA ACC GAT GCA CAG GGT GTG ATA AGC ATT 1951 
Pro Ser Lys Val Thr Lys Lys Thr Asp Ala Gin Gly Val He Ser He 
500 505 510 

GCA TTA ACT AGC TTC ACT GTT GGG GTG TCA GTC GTC ACT TTA GAT ATT 1999 
Ala Leu Thr Ser Phe Thr Val Gly Val Ser Val Val Thr Leu Asp He 
515 520 525 

CAG GGG CAA CAG GCT ACT GTT GAT GTA CGA TTT GCC GTG CTG CCG CCA 2047 
Gin Gly Gin Gin Ala Thr Val Asp Val Arg Phe Ala Val Leu Pro Pro 
530 535 540 545 

GAT GTC ACA AAC TCA AGT TTT AAC GTT TCT CCA TCT GAT ATT GTT GCC 2095 
Asp Val Thr Asn Ser Ser Phe Asn Val Ser Pro Ser Asp He Val Ala 
550 555 560 

GAT GGC TCC ATG CAG TCG ATA CTC ACC TTT GTT CCG CGT AAT AAA AAT 2143 
Asp Gly Ser Met Gin Ser He Leu Thr Phe Val Pro Arg Asn Lys Asn 
565 570 575 

AAT GAG TTT GTC AGT GGG ATA ACA GAT CTT GAA TTT ATA CAA AGT GGT 2191 
Asn Glu Phe Val Ser Gly He Thr Asp Leu Glu Bhe He Gin Ser Gly 
580 585 ,590 

GTT CCG GTA ACT ATT AGT CCG GTA ACC GAA AAT GCT GAC AAC TAT ACC 2239 
Val Pro Val Thr He Ser Pro Val Thr Glu Asn Ala Asp Asn Tyr Thr 
595 600 605 

GCC AGT GTG GTG GGA AAT TCG GTA GGA GAT GTC GAT ATT ACG CCG CAG 2287 
Ala Ser Val Val Gly Asn Ser Val Gly Asp Val Asp He Thr Pro Gin 
610 615 620 625 

GTG GGG GGG GAA TCA CTA GAC TTG TTG CAG AAA AGA ATC ACC CTG TAC 233 5 

Val Gly Gly Glu Ser Leu Asp Leu Leu Gin Lys Arg He Thr Leu Tyr 
630 635 640 

CCA GTA CCG AAG ATA ACC GGC ATT AAC GTG AAT GGT GAG CAA TTT GCC 23 83 

Pro Val Pro Lys lie Thr Gly He Asn Val Asn Gly Glu Gin Phe Ala 
645 650 655 

ACA GAT AAA GGC TTC CCG AAA ACT ACC TTT AAT AAA GCC ACG TTC CAA 2431 
Thr Asp Lys Gly Phe Pro Lys Thr Thr - Phe Asn Lys Ala Thr Phe Gin 
660 665 670 
TTG GTG ATG AAT GAC GAT GTG GCG AAT AAT ACT CAA TAT GAC TGG ACA 2479
Leu Val Met Asn Asp Asp Val Ala Asn Asn Thr Gln Tyr Asp Trp Thr
675 680 685

TCA TCC TAT GCG GCC AGT GCG CCG GTT GAT AAT CAG GGT AAA GTC AAT 2 527 

Ser Ser Tyr Ala Ala Ser Ala Pro Val Asp Asn Gin Gly Lys Val Asn 
690 695 700 705 

ATT GCC TAT AAA ACC TAT GGT AGC ACC GTC ACT GTG ACG GCA AAA AGT 257 5 

lie Ala Tyr Lys Thr Tyr Gly Ser Thr Val Thr Val Thr Ala Lys Ser 
710 715 720 

4 

AAA AAA TTC CCG AGT TAT ACG GCA ACA TAT CAA TTC AAA CCT AAT TTG 2623 
Lys Lys Phe Pro Ser Tyr Thr Ala Thr Tyr Gin Phe Lys Pro Asn Leu 
725 730 735 

TGG GTG TTC TCC GGC ACC ATG TCA CTG CAA TCA AGT GTC GAG GCG AGT 2671 
Trp Val Phe Ser Gly Thr Met Ser Leu Gin Ser Ser Val Glu Ala Ser 
740 745 750 

CGA AAT TGC CAG CGC ACT GAT TTT ACT GCG CTG ATC GAG TCC GCA CGC 2719 
Arg Asn Cys Gin Arg Thr Asp Phe Thr Ala Leu lie Glu Ser Ala Arg 
755 760 765 

GCC AGT AAT GGT TCG CGT TCA CCA GAC GGT ACT CTG TGG GGA GAG TGG 2767 
Ala Ser Asn Gly Ser Arg Ser Pro Asp Gly Thr Leu Trp Gly Glu Trp 
770 775 780 785 

GGA AGT TTG GCA ACC TAT GAT AGC GCT GAG TGG CCA TCG GGT AAC TAT 2 815 

Gly Ser Leu Ala Thr Tyr Asp Ser Ala Glu Trp Pro Ser Gly Asn Tyr 
790 795 800 

TGG ACT AAA AAG ACC AGT ACA GAT TTT GTC ACT ATG GAT ATG ACC ACC 2 863 

Trp Thr Lys Lys Thr Ser Thr Asp Phe Val Thr Met Asp Met Thr Thr 
805 810 815 

GGT GAC ATA CCA ACA TCT GCG GCT ACG GCG TAT CCG CTG TGT GCG GAG 2911 
Gly Asp lie Pro Thr Ser Ala Ala Thr Ala Tyr Pro Leu Cys Ala Glu 
820 825 830 

CCG CAA TAGTGCTAAA TACCAATCTT GCGGCCCAGC AAACTGGCAC CTTTAGCGTG 2967 
Pro Gin 
835 

ACCATCTGGC CC ATACAGTG ATTGGCCGTG GCGCGTATTC AAAACCGCCA GCGCCTGAGT 3027 

GTTATGCTCA ATATGCTGTT GCAGCAAAAG CCCGTTATGC AGGTTGCCGT AGCGCACCGT 3087 

TCGGCCAGTT CCAAAATACG CTGCCAGCGC TCAGCTAGCG CAGGAACGTT GCTGTAGGGC 3147 

GCTTGAATAT TTATGTTTTT TTCGGTGGTG AGCCGGGTCT GGTCCAGATA AGC C AAGGTG 32 07 

CCAAAATTGA ACTTTTTTGT TCAGTGACGC CTTGCAACAC GATACCTTGA ATCCGACCGG 32 67 

' AGCACAGCAG TTGCTGCTCT TGTGCTACTA CGGTTTTCAG GGATTCAAGC AGTTCCAGTT 3327 

GCTGGTCCAG ATTAGTTTGT AATCTTTCCA CCACCACCTA TCCTTTTACG GTTAATAATT 3 3 87 

TTACGGTCAA CGATTGTTGT GACGTTTAGC TATTCTTCAG GTCATCGGCA ACATTTTTGA 3447 
GCAAGGCATC GGCAATTTTA CCGGCGTCCA TGGTCAGTTG GCCTGAACGG ATCGCCTGTT 3507
TTAAGGTTTC GACACGTTCT ACATTGATGT CCTGGCTGCC CGGTTGCATC AATTTTGCCT 3567
GCGCGTCGCT CAATTTAACC TCAGTACCAC TTA 3600
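
By way of illustration only, the CDS feature given for SEQ ID NO:1 (413..2920) can be checked against the numbered translation with the following Python sketch. It assumes the standard genetic code and that the feature range includes the stop codon; the function names are hypothetical. The short test string is the run of eight codons that opens the coding region in the listing above.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cds_residue_count(start, end, includes_stop=True):
    """Number of amino acids encoded by a CDS spanning start..end (1-based)."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = length // 3
    return codons - 1 if includes_stop else codons


def translate(dna):
    """Translate a DNA string (spaces ignored) up to the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


residues = cds_residue_count(413, 2920)
opening = translate("ATG TAT TCA TTT TTT AAT ACG CTA")
print(residues, opening)
```

The residue count agrees with the last numbered residue of the translation (Gln 835), and the opening codons translate to Met Tyr Ser Phe Phe Asn Thr Leu as listed.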

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2220 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 536..1024

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GGATCACATC ATCAATACCG AAGCCCAAGA GATTAGCCAG TGTGCCAGAA AAATGGCTCG 60 

ATGGGACGTT GGTGGAAGGA AGCAAATATT GCTTACGGCA CGAAAACGCA TGATAGATGA 120 

GCTTCAGATG TATTTGCCAG GACTGGGAAG TCACGTGGGT AATTACTGTG ACATCCAGTA 180 

ATAAAACAGA GCCTCTATTA AAGGAGCTTC CCAATTTGAA ATCAGAAAAA TTACATCATA 240 

AACATGGGTG TCCAGAAGTC AGTCGGCGAT ATATCCATTT AAAGAGCATT GAGCTATGAC 3 00 

CAGTATTCAT CAACTACAGA ACAAAAATAC AGGAATAAGT GACTGATGGG ATAAAGCTGA 360 

GGTAAGCTCA CAGTACTGTA TCAATATCCA TATTTACATA TATATCATGG ATTTGGCATT 42 0 

ATATCATCAG CCATGTCAGT GATATGGTTA TTGTATTAGT ATTGTTATAA CAATCTGGAT 4 80 

TATTTTTATG AAAAAGACAT TACTAGCTAG TTCTCTAATA GCCTGTTTAT CAATT GCG 53 8 

Ala 
1 

TCT GTT AAT GTG TAC GCT GCG AGT GAA AGT AGT ATT TCT ATT GGT TAT 586 
Ser Val Asn Val Tyr Ala Ala Ser Glu Ser Ser lie Ser lie Gly Tyr 
5 10 15 

GCG CAA AGC CAT GTA AAA GAA AAT GGG TAT AC A TTG GAT AAT GAC CCT 634 
Ala Gin Ser His Val Lys Glu Asn Gly Tyr Thr Leu Asp Asn Asp Pro 
20 25 30 

AAA GGT TTT AAC CTG AAG TAC CGT TAT GAA CTC GAT GAT AAC TGG GGA 682
Lys Gly Phe Asn Leu Lys Tyr Arg Tyr Glu Leu Asp Asp Asn Trp Gly
35 40 45

GTA ATA GGT TCG TTT GCT TAT ACT CAT CAG GGA TAT GAT TTC TTC TAT 730
Val Ile Gly Ser Phe Ala Tyr Thr His Gln Gly Tyr Asp Phe Phe Tyr
50 55 60 65

GGC AGT AAT AAG TTT GGT CAT GGT GAT GTT GAT TAC TAT TCA GTA AC A 77 8 

Gly Ser Asn Lys Phe Gly His Gly Asp Val Asp Tyr Tyr Ser Val Thr 
70 75 80 

ATG GGG CCA TCT TTC CGC ATC AAC GAA TAT GTT AGC CTT TAT GGA TTA 326 
Met Gly Pro Ser Phe Arg lie Asn Glu Tyr Val Ser Leu Tyr Gly Leu 
85 90 95 

CTG GGG GCC GCT CAT GGA AAG GTT AAG GCA TCT GTA TTT GAT GAA TCA 874 
Leu Gly Ala Ala His Gly Lys Val Lys Ala Ser Val Phe Asp Glu Ser 
100 105 HO 

ATC AGT GCA AGT AAG ACG TCA ATG GCA TAC GGG GCA GGG GTG CAA TTC 922 
lie Ser Ala Ser Lys Thr Ser Met Ala Tyr Gly Ala Gly Val Gin Phe 
115 120 125 

AAC CCA CTT CCA AAT TTT GTC ATT GAC GCT TCA TAT GAA TAC TCC AAA 970 
Asn Pro Leu Pro Asn Phe Val lie Asp Ala Ser Tyr Glu Tyr Ser Lys 
130 135 140 145 

CTC GAT AGC ATA AAA GTT GGC ACC TGG ATG CTT GGT GCA GGG TAT CGA 1018 
Leu Asp Ser lie Lys Val Gly Thr Trp Met Leu Gly Ala Gly Tyr Arg 
150 155 160 

TTC TAATCATCTC AGATAGTGAA AACCCACCTG AGTGAAGTGA ACCCCATTTA 1071 
Phe 

TTGGACACTT TTCCTGGCGG TTGACATGGC CTGATTTCGG TACTGCACCG GACTCAGGCC 1131 

GTTTAATTTT ACTTTGATCC TTTCGTTGTT GTAGTAATGG ATATACTCAT CCACCGCTTT 1191 

TTTCAGTTGT TCTACATCTT CGTATTTTTC ATTGTGCCAG CATTCAGTCT TCAGCAGACC 1251 

AAAAAAGTTT TCTATCACAG CATTATCCAG GCAGTTGCCC TTGCGCGACA TACTTTGCTT 1311 

TACTTCGCCA GACCCCAGCC TTTTCTTATA GCTTGCCATC TGATATTGCC AGCCCTGATC 1371 

CGAGTGAAGT ACAGGTTCAT CGCCTGAGTT CAACTTCTGT AGCGCATCAT CAAGCATTTT 1431 

ATCAATCAGG TTCATTCCGG GATGCGTATC CATCTGCCAG GCAACGACTT CGC^GTTATA 1491 

C AG ATC C AGC ACGGGTGACA GATACAGCTT TTTACCCCTG ACGTTGAACT CGGTCACATC 1551 

GTTACCCACT TCTGGTTAGG GGCTTCGGCA GTAAATTTTC GAGCAAGTAT ATTAGGGACC 1611 

ACTTTACCGT AGGCACCCTG ATATGACTGA TATTTTTTAC GACGCAAGTT AGATGCAAGC 1671 

TGCTGTTGCC GCATGAGTTT TCGTACGGTT TTATGGTTAA GACTCCCGCC CTCATTGCGT 1731 

AGGGCCAGCG TTATTCTGCG GT AAC CAT AG CGACCTTTAT GATGGTGAAA CAGGGTTTTT 1791 

ATTCTTTGTT TCTCATCCGC ATAACTCTCT TCACGACCAC TGGATTTTAC CTGCCAGTAG 1851 

AAGGTGCTGC GCGGAAGACC GGCCACGTAA AGCAAGGTCG CCAGTTTATA CAGATGCCTT 1911 
AATTCAGTGA TTATTCGCGT TTTTTCCGCT GCTTCTCTTA CAGGTGGTAT TCACTGAGTG 1971
CCACCGATAA TGCGCAGGCA AAGTCATTAA CGACCCCCGC CGCTCACCCT GAGCATGGTC 2031
GTTGATGGCT TTTATATTTT CCATAGAGCA GAGGATGATT CTTTATGTCC CGAGTGAACT 2091
GGGGTGAACG GTTATCCCGG TTTGCCGCTG AATGGCAACG GACGGGAATA TCCCCTAAAG 2151
AGTGGTGTGA GAGAGAAGGT TATTCGTGGG GAACAGCGAA AGCGTATATT TCGATAAAAG 2211
CAGCGAAAG 2220



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6545 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3630..4820

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATCTTTTCCG TGGTATGACC AGAACATAAA GTTTTTGCTG CCCCAACGCC GGTGTCAGGC 60 

GCATAACGCC TTCCAGCCGA TCTGCACTCA TCACGCCTGG TTCCTAGTAG GTGAAATAAC 120 

TGCTGGGGAA GAAGGCTGAT GGGGTCATGT TCTGATCAAG AATCAGCACG TTCGGCGCAA 180 

AAACGCTGGT TTGTTTGTTC ACTTCGCTGG TCAGCGTCAG GGTCAGTTCG CCAATGTTTG 240 

CCGGGACGCT GTACGCAGCA ACCGGACCAC TGATGCCGGG AACGTTCAGT TGTTGGCCGC 300 

CGGTCGCCAG TTGGGTGGTC TGGGTTTTAG ATTGATCGAC CGGTGTCCAG GTGAGTTGTT 360 

GCAGCGCAGC AGATGGAATG GCTGGCGCGT CGCTGGTGTT TTGCGGTACG TAGTTAACAT 420 

CGGCAAGGCT AATTCCAGGC GCGCTTGCCA GTAACCCTGC TGATAAACAG AGGACGATGA 480 

GACTTTTATT CATTTTCATT GTTTTCACCT CAAAATCTGG AGCTCAGCGG TAGCCAGGCA 540 

ATAGCGCGCT AAACCCGATA ATCAGAGGGG CTTTCGCCCC TTCAGATAAT GACAACCTGT 600 

TTTTATGCCG GATGCGGCGT AAACGCCTTA TCCGGCCTAC ATTTGACAGC CGTTGTAGGC 660 

CTGATAAGAC GCGCAAGCGT CGCATCAGGC GTTGGTTGCC GAATGCGGCG TAAACGCCTT 720 

ATCCGGCCCA GGTTTTGCTA TTACCACCAG ATTTCCATCT GGGCACCGAA GGTCCACTCG 780 

TCGCTGTCGC CACGACCGAA GCTGCCGCCG TTGAAATCAG CAGGAACGGC TTTGCCGAAG 840 
TTCGCGTTGT TATCAGCGTT ACCGGTGTAG TCGTAACCCC ATTTCTCATC CCACTTGGCG 900 
TAGGTTGCGA AGACACGAAT AGCCGGGCGT GACCAGATGC TGTCGCCAGC CTGCCATTGT 960 
TGTGCGAGGG TAATTTTGTA CTGATTGTTC TTGTCGCCGG TGCGCTGGGA TTCGACGTTG 1020 
TCGTAGCCGA TTTCCATCAC GGTGCTCATG ATTGGCGTCC ACTTGTACAT CGGGCGAATA 1080 
CCGACGGTCC ACCACTTGGT GCCGTTGTCG TTATCCCAGT TGATATCCTG GTACATACCC 1140 
ACGTACATCA TGTCCCAGTT GTCGCCCATG GAGATCGCAC CGTGGTCGAG GATACGCAGC 1200 
ATGTGACCGT TGTTGTTGAT ATTGTAGGCA AATTTTTCGT TATCAAATGC AACGCCAGAA 1260 
CCCTGCGACA GCCCTTTACC CTGCGAGGTC ATCGAGTCAG TAGCGTACTG AACAACAAAC 1320 
TTGTTAAAGC CCTTCAGGAC ACTCTGAGTA TGTTCAGCAG TGAATAACCA GCCGTCTTTC 1380 
GATGCGCCAT CAACCAGACG ATAGTTATCA CGCAAGTTGG CACGACCGTA GTCGACACCC 1440 
AGTTCTAATG TGCCGCCCGG GTTGATTTCC ATCTGCGCTA AACGCACATC GAAAACGTCG 1500 
TTCGCGGTTT CGTTGGTATA GTCATAAATA TTGTTGCTGG CGAAAGAGGA AGAACCACCA 1560 
GCTTCAGAGG AGCGGGTTGC TGCCAGAGAG AGTTTACCGA AGCCAACATC GATGTTTTCC 1620 
AGACCGGCAC CAGGACCAGA AATATCCCAG TAGTAGAAGT CGATCATATG AACGTCATGA 1680 
CGTTGGTAGA AGCGCTTACC TGCCCAGATG GTGGAGCCTG GCAGCCATTC GATCAGGTTT 1740 
TTACCCTGCA CGTTTGCTTC ACGGAAGGCC GGATCGGTAG CTTCCCAGTC ATTCTGTTGT 1800 
GCGACGGAAT AGGCCACGTT AGTGTCGAAA TAGAAGCTCT TATCGCCCTC TTTCCACACT 1860 
TCCTGACCCA ATTTTAATTC AGCATAAGTT TCACATTCGT TGCCAAGACG GTATTTACTT 1920 
TGAGCACCGG TAGTCTGGAA ACACTGTTGT TCACCGCCGC TACCTGTCCA ACCAATACCG 1980 
GAACGTGCAT AGCCGTGGAA ATCAACAGCC ATTGCCTGAG CAGACATTAC GCCCGCTGCG 2040 
ACGGCAACCG CCAGAGGAAG TTTGCGCAGA GTAATCATCA TTCTATCTCC TGAGTCATTG 2100 
CTTTTCTTTT TTCACATCAC CTGTGACAGG CTTTGTGTGT TTTGTGGGGT GCTTAAACGC 2160 
CCGGCTCCTT ATGCAGTCGA CGACATGCAG TGCCATCCTC ACGGAACAGA TGGCAACGCT 2220 
CTGGCGGCAG GCCGATAGCG AATGTGGCAC CTTCTTCTAC CAACACCACG TCGTTCTGGC 2280 
GGTACACCAG GTTTTGACGA ATGGAAGGGA TCTGGATATG GATTTGAGTT TCGTTGCCGA 2340 
GTTGCTCGAC GACCTGAACT TCACCCTCAA GGATGACGTC AGCGATATCA CTCGGCAGTA 2400 
GATGTTCCGG GCGAATACCC AGCGACATAT TGGCTCCAAC CTGGACATCA CGGCTTTCAA 2460 
CTGGCAGCCA GACTTGCTGA CGATTTGGCA TCGGCAGCTC CACCTGCACT TGATCGATTG 2520 
CGGTGGCGGT CACTTTTACC GGCAGGAGTT CATCTTTGGC GAACCGATAA ATCCGGCGAC 2580 
AAAACGGTCT GCCGGATAGT GGTACAGCTA GCGGTTTCCC AACCTGCGCC ACGCGACCGG 2640 
CGTCCAGCAC CACGATTTTG TCGGCCAGCG TCATCGCTTC GACCTGATCG TGGGTGACGT 2700 
AAATCATTGT GCGGCCCAGG CGTTTATGCA GACGGGAGAT TTCGATACGC ATTTGCACAC 2760 
GCAGTGCAGC ATCGAGGTTG GAGAGCGGTT CATCGAGCAA AAATACGCTT GGCTCGGCCA 2820 
CCAGCGTACG GCCAATCGCC ACACGCTGAC GCTGACCACC GGAGAGCGCT TTCGGTTTGC 2880 
GATCCAGCAA ATGCGCCAGT TGTAGCACTT CCGCCACCTG GTTAACGCGT TGGTTAATCA 2940 
CCTCTTTTTT TGCGCCAGCA GGTTTCAGGC CAAATGACAT GTTTTCTGCT ACTGACAGGT 3000 
GGGGATAGAG CGCGTAAGAC TGAAACACCA TACCAACGCC GCGTTCTGCT GGCGGAGTGT 3060 
CATTCATCCG TTTCTCACCG ATGAACAGGT CGCCGCTGGT GATCGTCTCA AGCCCGGCAA 3120 
TCATGCGCAG TAAAGTCGAT TTACCGCAGC CAGACGGTCC GACAAACACC ACGAATTCAC 3180 
CTTCATGGAT ATCGAGATTG ATATCTTTCG ATACCACGAC CTCGCCCCAG GCTTTCGTTA 3240 
CATTTTGCAG CTGTACGCTC GCCATGCCCT TCTCCCTTTG TAACAACCTG TCATCGACAG 3300 
CAACATTCAT GATGGGCTGA CTATGCGTCA TCAGGAGATG GCTTAAATCC TCCACCCCCT 3360 
GGCTTTTTTA TGGGGGAGGA GGCGGGAGGA TGAGAACACG GCTTCTGTGA ACTAAACCGA 3420 
GGTCATGTAA GGAATTTCGT GATGTTGCTT GCAAAAATCG TGGCGATTTT ATGTGCGCAT 3480 
CTCCACATTA CCGCCAATTC TGTAACAGAG ATCACACAAA GCGACGGTGG GGCGTAGGGG 3540 
CAAGGAGGAT GGAAAGAGGT TGCCGTATAA AGAAACTAGA GTCCGTTTAG GTGTTTTCAC 3600 

GAGCACTTCA CCAACAAGGA CCATAGATT ATG AAA ATA AAA ACA GGT GCA CGC 3653 



Met Lys Ile Lys Thr Gly Ala Arg 
1 5 

ATC CTC GCA TTA TCC GCA TTA ACG ACG ATG ATG TTT TCC GCC TCG GCT 3701 
Ile Leu Ala Leu Ser Ala Leu Thr Thr Met Met Phe Ser Ala Ser Ala 
10 15 20 

CTC GCC AAA ATC GAA GAA GGT AAA CTG GTA ATC TGG ATT AAC GGC GAT 3749 
Leu Ala Lys Ile Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp 
25 30 35 40 

AAA GGC TAT AAC GGT CTC GCT GAA GTC GGT AAG AAA TTC GAG AAA GAT 3797 
Lys Gly Tyr Asn Gly Leu Ala Glu Val Gly Lys Lys Phe Glu Lys Asp 
45 50 55 

ACC GGA ATT AAA GTC ACC GTT GAG CAT CCG GAT AAA CTG GAA GAG AAA 3845 
Thr Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys 
60 65 70 

TTC CCA CAG GTT GCG GCA ACT GGC GAT GGC CCT GAC ATT ATC TTC TGG 3893 
Phe Pro Gln Val Ala Ala Thr Gly Asp Gly Pro Asp Ile Ile Phe Trp 
75 80 85 

GCA CAC GAC CGC TTT GGT GGC TAC GCT CAA TCT GGC CTG TTG GCT GAA 3941 
Ala His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu 
90 95 100 

ATC ACC CCG GAC AAA GCG TTC CAG GAC AAG CTG TAT CCG TTT ACC TGG 3989 
Ile Thr Pro Asp Lys Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp 
105 110 115 120 

GAT GCC GTA CGT TAC AAC GGC AAG CTG ATT GCT TAC CCG ATC GCT GTT 4037 
Asp Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala Tyr Pro Ile Ala Val 
125 130 135 

GAA GCG TTA TCG CTG ATT TAT AAC AAA GAT CTG CTG CCG AAC CCG CCA 4085 
Glu Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro 
140 145 150 

AAA ACC TGG GAA GAG ATC CCG GCG CTG GAT AAA GAA CTG AAA GCG AAA 4133 
Lys Thr Trp Glu Glu Ile Pro Ala Leu Asp Lys Glu Leu Lys Ala Lys 
155 160 165 

GGT AAG AGC GCG CTG ATG TTC AAC CTG CAA GAA CCG TAC TTC ACC TGG 4181 
Gly Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp 
170 175 180 

CCG CTG ATT GCT GCT GAC GGG GGT TAT GCG TTC AAG TAT GAA AAC GGC 4229 
Pro Leu Ile Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly 
185 190 195 200 

AAG TAC GAC ATT AAA GAC GTG GGC GTG GAT AAC GCT GGC GCG AAA GCG 4277 
Lys Tyr Asp Ile Lys Asp Val Gly Val Asp Asn Ala Gly Ala Lys Ala 
205 210 215 

GGT CTG ACC TTC CTG GTT GAC CTG ATT AAA AAC AAA CAC ATG AAT GCA 4325 
Gly Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn Ala 
220 225 230 

GAC ACC GAT TAC TCC ATC GCA GAA GCT GCC TTT AAT AAA GGC GAA ACA 4373 
Asp Thr Asp Tyr Ser Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr 
235 240 245 

GCG ATG ACC ATC AAC GGC CCG TGG GCA TGG TCC AAC ATC GAC ACC AGC 4421 
Ala Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile Asp Thr Ser 
250 255 260 

AAA GTG AAT TAT GGT GTA ACG GTA CTG CCG ACC TTC AAG GGT CAA CCA 4469 
Lys Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro 
265 270 275 280 

TCC AAA CCG TTC GTT GGC GTG CTG AGC GCA GGT ATT AAC GCC GCC AGT 4517 
Ser Lys Pro Phe Val Gly Val Leu Ser Ala Gly Ile Asn Ala Ala Ser 
285 290 295 

CCG AAC AAA GAG CTG GCG AAA GAG TTC CTC GAA AAC TAT CTG CTG ACT 4565 
Pro Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr Leu Leu Thr 
300 305 310 

GAT GAA GGT CTG GAA GCG GTT AAT AAA GAC AAA CCG CTG GGT GCC GTA 4613 
Asp Glu Gly Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val 
315 320 325 

GCG CTG AAG TCT TAC GAG GAA GAG TTG GCG AAA GAT CCA CGT ATT GCC 4661 
Ala Leu Lys Ser Tyr Glu Glu Glu Leu Ala Lys Asp Pro Arg Ile Ala 
330 335 340 

GCC ACC ATG GAA AAC GCC CAG AAA GGT GAA ATC ATG CCG AAC ATC CCG 4709 
Ala Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro 
345 350 355 360 

CAG ATG TCC GCT TTC TGG TAT GCC GTG CGT ACT GCG GTG ATC AAC GCC 4757 
Gln Met Ser Ala Phe Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala 
365 370 375 

GCC AGC GGT CGT CAG ACT GTC GAT GAA GCC CTG AAA GAC GCG CAG ACT 4805 
Ala Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys Asp Ala Gln Thr 
380 385 390 

CGT ATC ACC AAG TAATGCTGTG AAATGCCGGA TGCGGCGTGA ACGCCTTGTC 4857 
Arg Ile Thr Lys 
395 

CGGCCTACAA AACCGAAACG TATGTAGGCC TGATAAGACG CGTCAGCGTC GCATCAGGCA 4917 

GTTGTTGTCG GATAAGGCGT GAAAGCCTTA TCCGTCCTGG AATGAGGAAG AACCCCATGG 4977 

ATGTCATTAA AAAGAAACAT TGGTGGCAAA GCGACGCGCT GAAATGGTCA GTGCTAGGTC 5037 

TGCTCGGCCT GCTGGTGGGT TACCTTGTTG TTTTAATGTA CGCACAAGGG GAATACCTGT 5097 

TCGCCATTAC CACGCTGATA TTGAGTTCAG CGGGGCTGTA TATTTTCGCC AATCGTAAAG 5157 

CCTACGCCTG GCGCTATGTT TACCCGGGAA TGGCTGGAAT GGGATTATTC GTCCTCTTCC 5217 

CTCTGGTCTG CACCATCGCC ATTGCCTTCA CCAACTACAG CAGCACTAAC CAGCTGACTT 5277 

TTGAACGTGC GCAGGAAGTG TTGTTAGATC GCTCCTGGCA AGCAGGCAAA ACCTATAACT 5337 

TTGGTCTTTA CCCGGCGGGC GATGAGTGGC AACTGGCGCT CAGCGACGGC GAAACCGGCA 5397 

AAAATTACCT CTCCGACGCT TTTAAATTTG GCGGCGAGCA AAAACTGCAA CTGAAAGAAA 5457 

CGACCGCCCA GCCCGAAGGC GAACGCGCGA ATCTGCGCGT GATTACCCAG AATCGTCAGG 5517 

CGCTGAGTGA CATTACCGCC ATTCTGCCGG ATGGCAACAA AGTGATGATG AGCTCCCTGC 5577 

GCCAGTTTTC TGGCACGCAG CCGCTCTACA CACTCGACGG TGACGGCACG TTGACGAATA 5637 

ATCAGAGCGG CGTGAAATAT CGTCCGAATA ACCAAATTGG CTTTTACCAG TCCATTACCG 5697 

CCGACGGCAA CTGGGGTGAT GAAAAGCTAA GCCCCGGTTA CACCGTGACC ACCGGCTGGA 5757 

AAAACTTTAC CCGCGTCTTT ACCGACGAAG GCATTCAGAA ACCGTTCCTC GCCATTTTCG 5817 

TCTGGACCGT GGTGTTCTCG CTGATCACTG TCTTTTTAAC GGTGGCGGTC GGCATGGTTC 5877 

TGGCGTGTCT GGTGCAGTGG GAAGCGTTGC GCGGCAAAGC GGTCTATCGC GTCCTGCTGA 5937 

TTCTGCCCTA CGCGGTGCCA TCGTTCATTT CAATCTTGAT TTTCAAAGGG TTGTTTAACC 5997 
AGAGCTTCGG TGAAATCAAC ATGATGTTGA GCGCGCTGTT TGGCGTGAAG CCCGCCTGGT 6057 

TCAGCGATCC GACCACCGCC CGCACGATGC TAATTATCGT CAATACCTGG CTGGGTTATC 6117 

CGTACATGAT GATCCTCTGC ATGGGCTTGC TGAAAGCGAT TCCGGACGAT TTGTATGAAG 6177 

CCTCAGCAAT GGATGGCGCA GGTCCGTTCC AGAACTTCTT TAAGATTACG CTGCCGCTGC 6237 

TGATTAAACC GCTGACGCCG CTGATGATCG CCAGCTTCGC CTTTAACTTT AACAACTTCG 6297 

TGCTGATTCA ACTGTTAACC AACGGCGGCC CGGATCGTCT TGGCACGACC ACGCCAGCCG 6357 


GTTATACCGA CCTGCTTGTT AACTACACCT ACCGCATCGC TTTTGAAGGC GGCGGGGGTC 6417 

AGGACTTCGG TCTGGCGGCA GCAATTGCCA CGCTGATCTT CCTGCTGGTG GGTGCGCTGG 6477 

CGATAGTGAA CCTGAAAGCC ACGCGAATGA AGTTTGATTA AGGGAGATAA CAAAAATGGC 6537 

AATGGTCC 6545 
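
The CDS feature stated for SEQ ID NO: 3 (LOCATION: 3630..4820) can be checked arithmetically against the translation shown beneath the sequence. The following is a minimal sketch, not part of the original listing; the function names are hypothetical, and the codon table is deliberately limited to the codons of the first coding line (standard genetic code assumed).

```python
# Illustrative check of the CDS coordinates given for SEQ ID NO: 3.
# Assumption: standard genetic code; the table below covers only the
# codons needed for the first coding line of the listing.
CODON_TABLE = {
    "ATG": "Met", "AAA": "Lys", "ATA": "Ile", "ACA": "Thr",
    "GGT": "Gly", "GCA": "Ala", "CGC": "Arg",
}


def cds_length(start, end):
    """Length in nucleotides of a CDS given 1-based inclusive coordinates."""
    return end - start + 1


def translate(dna):
    """Translate a DNA string codon by codon, ignoring spaces."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3)]


length = cds_length(3630, 4820)   # 1191 nucleotides
codons = length // 3              # 397 codons, the last being the stop codon
first_residues = translate("ATG AAA ATA AAA ACA GGT GCA CGC")

print(length, codons, first_residues)
```

Under these assumptions the CDS spans 1191 nucleotides, that is, 397 codons: 396 residues followed by the TAA stop codon, which agrees with the translation ending at Lys 396 in the listing.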



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 588 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr 
1 5 10 15 

Thr Met Met Phe Ser Ala Ser Ala Leu Ala Lys Ile Glu Glu Gly Lys 
20 25 30 

Leu Val Ile Trp Ile Asn Gly Asp Lys Gly Tyr Asn Gly Leu Ala Glu 
35 40 45 

Val Gly Lys Lys Phe Glu Lys Asp Thr Gly Ile Lys Val Thr Val Glu 
50 55 60 

His Pro Asp Lys Leu Glu Glu Lys Phe Pro Gln Val Ala Ala Thr Gly 
65 70 75 80 

Asp Gly Pro Asp Ile Ile Phe Trp Ala His Asp Arg Phe Gly Gly Tyr 
85 90 95 

Ala Gln Ser Gly Leu Leu Ala Glu Ile Thr Pro Asp Lys Ala Phe Gln 
100 105 110 

Asp Lys Leu Tyr Pro Phe Thr Trp Asp Ala Val Arg Tyr Asn Gly Lys 
115 120 125 

Leu Ile Ala Tyr Pro Ile Ala Val Glu Ala Leu Ser Leu Ile Tyr Asn 
130 135 140 

Lys Asp Leu Leu Pro Asn Pro Pro Lys Thr Trp Glu Glu Ile Pro Ala 
145 150 155 160 

Leu Asp Lys Glu Leu Lys Ala Lys Gly Lys Ser Ala Leu Met Phe Asn 
165 170 175 

Leu Gln Glu Pro Tyr Phe Thr Trp Pro Leu Ile Ala Ala Asp Gly Gly 
180 185 190 

Tyr Ala Phe Lys Tyr Glu Asn Gly Lys Tyr Asp Ile Lys Asp Val Gly 
195 200 205 

Val Asp Asn Ala Gly Ala Lys Ala Gly Leu Thr Phe Leu Val Asp Leu 
210 215 220 

Ile Lys Asn Lys His Met Asn Ala Asp Thr Asp Tyr Ser Ile Ala Glu 
225 230 235 240 

Ala Ala Phe Asn Lys Gly Glu Thr Ala Met Thr Ile Asn Gly Pro Trp 
245 250 255 

Ala Trp Ser Asn Ile Asp Thr Ser Lys Val Asn Tyr Gly Val Thr Val 
260 265 270 

Leu Pro Thr Phe Lys Gly Gln Pro Ser Lys Pro Phe Val Gly Val Leu 
275 280 285 

Ser Ala Gly Ile Asn Ala Ala Ser Pro Asn Lys Glu Leu Ala Lys Glu 
290 295 300 

Phe Leu Glu Asn Tyr Leu Leu Thr Asp Glu Gly Leu Glu Ala Val Asn 
305 310 315 320 

Lys Asp Lys Pro Leu Gly Ala Val Ala Leu Lys Ser Tyr Glu Glu Glu 
325 330 335 

Leu Ala Lys Asp Pro Arg Ile Ala Ala Thr Met Glu Asn Ala Gln Lys 
340 345 350 

Gly Glu Ile Met Pro Asn Ile Pro Gln Met Ser Ala Phe Trp Tyr Ala 
355 360 365 

Val Arg Thr Ala Val Ile Asn Ala Ala Ser Gly Arg Gln Thr Val Asp 
370 375 380 

Glu Ala Leu Lys Asp Ala Gln Thr Arg Ile Thr Lys Val Pro Thr Leu 
385 390 395 400 

Thr Gly Ile Leu Val Asn Gly Gln Asn Phe Ala Thr Asp Lys Gly Phe 
405 410 415 

Pro Lys Thr Ile Phe Lys Asn Ala Thr Phe Gln Leu Gln Met Asp Asn 
420 425 430 

Asp Val Ala Asn Asn Thr Gln Tyr Glu Trp Ser Ser Ser Phe Thr Pro 
435 440 445 

Asn Val Ser Val Asn Asp Gln Gly Gln Val Thr Ile Thr Tyr Gln Thr 
450 455 460 

Tyr Ser Glu Val Ala Val Thr Ala Lys Ser Lys Lys Phe Pro Ser Tyr 
465 470 475 480 

Ser Val Ser Tyr Arg Phe Tyr Pro Asn Arg Trp Ile Tyr Asp Gly Gly 
485 490 495 

Ara Ser Leu Val Ser Ser Leu Glu Ala Ser Arg Gln Cys Gln Gly Ser 
500 505 510 

Asp Met Ser Ala Val Leu Glu Ser Ser Arg Ala Thr Asn Gly Thr Arg 
515 520 525 

Ala Pro Asp Gly Thr Leu Trp Gly Glu Trp Gly Ser Leu Thr Ala Tyr 
530 535 540 

Ser Ser Asp Trp Gln Ser Gly Glu Tyr Trp Val Lys Lys Thr Ser Thr 
545 550 555 560 

Asp Phe Glu Thr Met Asn Met Asp Thr Gly Ala Leu Gln Pro Gly Pro 
565 570 575 

Ala Tyr Leu Ala Phe Pro Leu Cys Ala Leu Ser Ile 
580 585 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 568 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr 
1 5 10 15 

Thr Met Met Phe Ser Ala Ser Ala Leu Ala Lys Ile Glu Glu Gly Lys 
20 25 30 

Leu Val Ile Trp Ile Asn Gly Asp Lys Gly Tyr Asn Gly Leu Ala Glu 
35 40 45 

Val Gly Lys Lys Phe Glu Lys Asp Thr Gly Ile Lys Val Thr Val Glu 
50 55 60 

His Pro Asp Lys Leu Glu Glu Lys Phe Pro Gln Val Ala Ala Thr Gly 
65 70 75 80 

Asp Gly Pro Asp Ile Ile Phe Trp Ala His Asp Arg Phe Gly Gly Tyr 
85 90 95 

Ala Gln Ser Gly Leu Leu Ala Glu Ile Thr Pro Asp Lys Ala Phe Gln 
100 105 110 

Asp Lys Leu Tyr Pro Phe Thr Trp Asp Ala Val Arg Tyr Asn Gly Lys 
115 120 125 

Leu Ile Ala Tyr Pro Ile Ala Val Glu Ala Leu Ser Leu Ile Tyr Asn 
130 135 140 

Lys Asp Leu Leu Pro Asn Pro Pro Lys Thr Trp Glu Glu Ile Pro Ala 
145 150 155 160 

Leu Asp Lys Glu Leu Lys Ala Lys Gly Lys Ser Ala Leu Met Phe Asn 
165 170 175 

Leu Gln Glu Pro Tyr Phe Thr Trp Pro Leu Ile Ala Ala Asp Gly Gly 
180 185 190 

Tyr Ala Phe Lys Tyr Glu Asn Gly Lys Tyr Asp Ile Lys Asp Val Gly 
195 200 205 

Val Asp Asn Ala Gly Ala Lys Ala Gly Leu Thr Phe Leu Val Asp Leu 
210 215 220 

Ile Lys Asn Lys His Met Asn Ala Asp Thr Asp Tyr Ser Ile Ala Glu 
225 230 235 240 

Ala Ala Phe Asn Lys Gly Glu Thr Ala Met Thr Ile Asn Gly Pro Trp 
245 250 255 

Ala Trp Ser Asn Ile Asp Thr Ser Lys Val Asn Tyr Gly Val Thr Val 
260 265 270 

Leu Pro Thr Phe Lys Gly Gln Pro Ser Lys Pro Phe Val Gly Val Leu 
275 280 285 

Ser Ala Gly Ile Asn Ala Ala Ser Pro Asn Lys Glu Leu Ala Lys Glu 
290 295 300 

Phe Leu Glu Asn Tyr Leu Leu Thr Asp Glu Gly Leu Glu Ala Val Asn 
305 310 315 320 

Lys Asp Lys Pro Leu Gly Ala Val Ala Leu Lys Ser Tyr Glu Glu Glu 
325 330 335 

Leu Ala Lys Asp Pro Arg Ile Ala Ala Thr Met Glu Asn Ala Gln Lys 
340 345 350 

Gly Glu Ile Met Pro Asn Ile Pro Gln Met Ser Ala Phe Trp Tyr Ala 
355 360 365 

Val Arg Thr Ala Val Ile Asn Ala Ala Ser Gly Arg Gln Thr Val Asp 
370 375 380 

Glu Ala Leu Lys Asp Ala Gln Thr Asn Ser Ser Ser Val Pro Gly Arg 
385 390 395 400 

Gly Ser Ile Glu Gly Arg Ala Ser Val Asn Val Tyr Ala Ala Ser Glu 
405 410 415 

Ser Ser Ile Ser Ile Gly Tyr Ala Gln Ser His Val Lys Glu Asn Gly 
420 425 430 

Tyr Thr Leu Asp Asn Asp Pro Lys Gly Phe Asn Leu Lys Tyr Arg Tyr 
435 440 445 

Glu Leu Asp Asp Asn Trp Gly Val Ile Gly Ser Phe Ala Tyr Thr His 
450 455 460 

Gln Gly Tyr Asp Phe Phe Tyr Gly Ser Asn Lys Phe Gly His Gly Asp 
465 470 475 480 

Val Asp Tyr Tyr Ser Val Thr Met Gly Pro Ser Phe Arg Ile Asn Glu 
485 490 495 

Tyr Val Ser Leu Tyr Gly Leu Leu Gly Ala Ala His Gly Lys Val Lys 
500 505 510 

Ala Ser Val Phe Asp Glu Ser Ile Ser Ala Ser Lys Thr Ser Met Ala 
515 520 525 

Tyr Gly Ala Gly Val Gln Phe Asn Pro Leu Pro Asn Phe Val Ile Asp 
530 535 540 

Ala Ser Tyr Glu Tyr Ser Lys Leu Asp Ser Ile Lys Val Gly Thr Trp 
545 550 555 560 

Met Leu Gly Ala Gly Tyr Arg Phe 
565 
CLAIMS 

What is claimed is: 

1. A therapeutic delivery system for the delivery of 
a therapeutic agent, comprising: 

a) a therapeutic agent; and 

b) an invasion proficient bacterial protein which 
transports the composition across the gastrointestinal 
membrane barrier via transcytosis, and thereby 
increases the systemic bioavailability of said 
therapeutic agent. 

2. The delivery system according to Claim 1, wherein 
transcytosis via said bacterial protein increases the 
systemic bioavailability of said therapeutic agent by 
5-fold to 100-fold. 

3. The delivery system according to Claim 1, wherein 
said bacterial protein is invasin protein. 

4. The delivery system according to Claim 1, wherein 
said bacterial protein is attachment-invasion-locus 
protein. 

5. The delivery system according to Claim 1, further 
comprising a carrier component. 

6. The delivery system according to Claim 5, wherein 
said carrier component is selected from the group 
consisting of liposomes and polymer-based particles. 

7. The delivery system according to Claim 1, wherein 
said therapeutic agent and said invasion proficient 
bacterial protein are linked by a degradable peptide 
sequence. 

8. A pharmaceutical composition comprising: 

a) a therapeutic agent; 

b) an invasion proficient bacterial protein which 
transports the composition across the gastrointestinal 
tract; and 

c) a carrier component. 

9. The composition according to Claim 8, wherein 
said bacterial protein is invasin or attachment-invasion-locus 
protein or a fragment thereof. 

10. The composition according to Claim 8, wherein 
said therapeutic agent and said bacterial protein are 
linked by a degradable peptide sequence. 

11. The composition according to Claim 8, wherein 
said carrier is selected from the group consisting of 
a liposome and a polymer particle. 

12. A pharmaceutical composition comprising: a 
fusion protein including a therapeutic moiety and an 
invasion proficient bacterial protein to effect 
delivery of the composition across the 
gastrointestinal tract. 

13. The composition according to Claim 12, wherein 
said bacterial protein is invasin protein. 

14. The composition according to Claim 12, wherein 
said bacterial protein is attachment-invasion-locus 
protein. 

15. The composition according to Claim 12, further 
comprising a carrier component. 

16. The composition according to Claim 15, wherein 
said carrier component is selected from the group 
consisting of liposomes and polymer-based particles. 

17. The composition according to Claim 12, wherein 
said therapeutic moiety and said invasion proficient 
bacterial protein are linked by a degradable peptide 
sequence. 

18. A method of delivering a therapeutic agent 
through the gastrointestinal membrane barrier, 
comprising: orally administering a pharmaceutical 
composition comprising a therapeutic agent and an 
invasion proficient bacterial protein. 

19. The method according to Claim 18, wherein said 
invasion protein is invasin protein. 

20. The method according to Claim 18, wherein said 
invasion protein is attachment-invasion-locus protein. 

21. The method according to Claim 18, wherein said 
pharmaceutical composition further comprises a carrier 
component. 

22. The method according to Claim 18, wherein said 
pharmaceutical composition comprises a fusion protein 
including said therapeutic agent and said invasion 
protein. 

23. A pharmaceutical composition comprising: a 
fusion protein comprising a therapeutic agent, an 
invasion proficient bacterial protein to effect 
delivery of the composition across the 
gastrointestinal tract and a carrier component. 
Figure 1 

GAGTCGTACT GTGGGGAAAA CCGGCGAGAG CGAAGCGGCG GTCCATATAC 
+ + + + + 50 



CCTCCTTAAC TAAGCCAGCG GTTGCTTAGT CGCATTAGAT TAATGCATCG 
51 + + + + + 100 

TGAAATGCAG AGAGTCTATT TTATGAGACG AATGTAAACT ATTTTGATAA 



101 + + + + + 150 

TAATAATATA TCACAATATA TATATACATG CTAAATATAA CCTGACAATT 
151 + + + + + 200 

AAATTAACAA GCTAATATTA CCATGATGAT TTTTTTTTTT TGCATTTCAT 
201 + + + + + 250 



TTGTCATTGC TGTTATTTTT AATTTTTTAA TTTTATTTTT GTAAGTTCTG 
+ + + + + 

CTATTCTATT GTTAGTGTTT GCGAGAGAGA AGAAGTTATT TCTTGTCGCT 
+ + + -+ + 

GTTTTCATTT CTGTTGCTTA AGTAAATATT ACCGCGTTAA TTTATACCTA 

+ + + + + 

AGGGGTACAC TAATGTATTC ATTTTTTAAT ACGCTAACTG TGACTAAAAT 



401 + + + + + 450 

MYS FFN TLTV TKI 

CATTAGCAGG CTAATATTAT CGATCGGTTT AATATTTGGA ATATTTACTT 

451 + + + + + 500 

I S R LILS IGL IFG I F T Y 



ATGGGTTCTC ACAGCAACAT TATTTTAATT CAGAAGCGTT AGAGAACCCC 

501 + + + + + 550 

GFS QQH YFNS EAL E N P 

i 

GCTGAACATA ATGAGGCTTT CAATAAGATA ATCAGTACCG GAACCAGTCT 

551 + + + + + 600 

AEHN E A F NKI ISTG TSL 



GGCGGTATCG GGTAATGCAT CCAATATCAC CAGGTCAATG GTAAATGACG 
+ + + + + 

AVS GNAS NIT RSM VNDA 



CGGCAAATCA GGAAGTAAAA CACTGGTTAA ATAGATTTGG GACAACTCAG 

651 + + + + + 700 

ANQ EVK HWLN RFG TTQ 



GTCAATGTTA ACTTTGATAA AAAGTTCTCC CTCAAAGAAA GTTCTCTTGA 

VNVN FDK KFS LKES SLD 

TTGGCTGTTG CCTTGGTATG ACTCTGCTTC ATATGTCTTT TTTAGTCAGT 
j. + + + 

WLL PWYD SAS YVF FSQL 
Figure 1 (continued) 

TGGGTATAAG AAATAAAGAC AGTCGCAATA CCCTTAATAT CGGCGCTGGG 
801 + + + + + 850 

GIR NKD SRNT LNI GAG 

GTGCGTACCT TCCAACAAAG TTGGATGTAT GGCTTTAACA CTTCCTATGA 
851 + + + + + 900 

V R T F QQS WMY GFNT S Y D 

CAATGATATG ACTGGGCACA ATCATCGTAT TGGCGTGGGT GCAGAAGCCT 
901 + + + + + 950 

NDM TGHN HRI GVG AEAW 

GGACTGATTA TTTACAATTA TCGGCCAATG GTTATTTTCG CCTCAATGGT 
951 + + + + + 1000 

T D Y LQL SANG YFR LNG 

TGGCATCAAT CTCGTGATTT CGCGGACTAT AATGAGCGCC CGGCAAGCGG 
1001 + + + + + 1050 

WHQS RDF ADY NERP ASG 

GGGCGACATT CACGTCAAAG CGTATTTACC TGCGCTGCCA CAATTGGGCG 

1051 + + + + + 1100 

GDI HVKA YLP ALP QLGG 

GGAAATTAAA ATATGAGCAG TACCGTGGTG AGCGGGTGGC TTTATTTGGT 

1101 + + + + + 1150 

KLK YEQ YRGE R V A LFG 

AAAGATAACC TGCAAAGTAA CCCTTATGCG GTGACCACAG GGCTTATTTA 
1151 + + + + + 1200 

KDNL QSN PYA VTTG LIY 

TACGCCGATC CCCTTCATTA CACTGGGGGT CGATCAACGA ATGGGAAAAA 

1201 + + + + + 1250 

T P I PFIT L G V DQR MGKS 

GTCGGCAGCA TGAAATACAA TGGAACTTAC AAATGGATTA TCGCCTCGGC 
1251 + + + + + 1300 

RQH E I Q WNLQ MDY RLG 

GAAAGTTTTC GTTCGCAGTT TAGCCCCGCA GTGGTGGCCG GAACTCGTTT 
1301 + + + + + 1350 

ESFR SQF SPA VVAG TRL 

ACTGGCTGAG AGCCGTTATA ATCTGGTTGA GCGCAATCCA AATATTGTTC 

1351 + + + + + 1400 

LAE SRYN LVE RNP N I V L 

TGGAATACCA AAAACAGAAT ACTATCAAAT TGGCATTTTC ACCCGCCGTA 

1401 + + + + + 1450 

EYQ KQN TIKL A F S PAV 

CTCTCCGGCC TGCCGGGGCA GGTTTATTCC GTTAGTGCAC AAATACAGTC 

1451 + + + + + 1500 

LSGL PGQ VYS VSAQ I Q S 
Figure 1 (continued) 



TCAATCTGCA CTACAACGTA TTCTCTGGAA TGATGCGCAA TGGGTTGCTG 

1501 + + + + + 1550 

QSA LQRI LWN D A Q WVAA 

CCGGCGGCAA ATTAATACCC GTCAGTGCAA CAGATTACAA TGTGGTCTTA 

1551 + + + + + 1600 

GGK LIP VSAT DYN V V L 

CCGCCTTATA AACCGATGGC ACCAGCGAGT CGTACTGTGG GGAAAACCGG 

1601 + + + + + 1650 

PPYK PMA PAS RTVG KTG 

CGAGAGCGAA GCGGCGGTCA ATACCTATAC CCTCAGCGCC ACGGCTATCG 

1651 + + + + + 1700 

E S E A A V N TYT LSA TAID 

ATAACCACGG CAATAGTTCT AATCCAGCTA CGTTGACCGT TATTGTGCAG 

1701 + + + + + 1750 

NHG NSS NPAT LTV I V Q 

CAACCTCAGT TCGTTATTAC CTCGGAAGTG ACTGATGATG GTGCGCTTGC 

1751 + + + + + 1800 

QPQF VIT S E V TDDG ALA 

TGATGGCAGG ACTCCCATCA CGGTGAAATT TACAGTGACT AATATTGATA 

1801 + + + + + 1850 

DGR TPIT VKF TVT NIDS 

GTACGCCGGT TGCCGAGCAA GAGGGGGTGA TAACCACCAG TAATGGTGCG 

1851 + + + + + 1900 

TPV AEQ EGVI TTS NGA 

CTTCCCAGTA AAGTCACAAA AAAAACCGAT GCACAGGGTG TGATAAGCAT 

1901 + + + + + 1950 

LPSK VTK KTD AQGV ISI 

TGCATTAACT AGCTTCACTG TTGGGGTGTC AGTCGTCACT TTAGATATTC 

1951 + + + + + 2000 

ALT SFTV GVS VVT LDIQ 

AGGGGCAACA GGCTACTGTT GATGTACGAT TTGCCGTGCT GCCGCCAGAT 

2001 + + + + + 2050 

GQQ A T V DVRF A V L PPD 

GTCACAAACT CAAGTTTTAA CGTTTCTCCA TCTGATATTG TTGCCGATGG 

2051 + + + + + 2100 

VTNS SFN VSP SDIV ADG 

CTCCATGCAG TCGATACTCA CCTTTGTTCC GCGTAATAAA AATAATGAGT 

2101 + + + + + 2150 

SMQ SILT FVP RNK NNEF 

TTGTCAGTGG GATAACAGAT CTTGAATTTA TACAAAGTGG TGTTCCGGTA 

2151 + + + + + 2200 

VSG ITD LEFI QSG VPV 
Figure 1 (continued) 



ACTATTAGTC CGGTAACCGA AAATGCTGAC AACTATACCG CCAGTGTGGT 

2201 + + + + + 2250 

TISP V T E NAD N Y T A S V V 

GGGAAATTCG GTAGGAGATG TCGATATTAC GCCGCAGGTG GGGGGGGAAT 

2251 + + + + + 2300 

GNS VGDV DIT PQV GGES 

CACTAGACTT GTTGCAGAAA AGAATCACCC TGTACCCAGT ACCGAAGATA 

2301 + + + + + 2350 

LDL LQK RITL Y P V PKI 

ACCGGCATTA ACGTGAATGG TGAGCAATTT GCCACAGATA AAGGCTTCCC 

2351 + + + + + 2400 

TGIN VNG E Q F ATDK G F P 

GAAAACTACC TTTAATAAAG CCACGTTCCA ATTGGTGATG AATGACGATG 

2401 + + + + + 2450 

KTT FNKA T FQ LVM NDDV 

TGGCGAATAA TACTCAATAT GACTGGACAT CATCCTATGC GGCCAGTGCG 

2451 + + + + + 2500 

ANN TQY DWTS SYA ASA 

CCGGTTGATA ATCAGGGTAA AGTCAATATT GCCTATAAAA CCTATGGTAG 

2501 + + + + + 2550 

PVDN QGK VNI AYKT YGS 

CACCGTCACT GTGACGGCAA AAAGTAAAAA ATTCCCGAGT TATACGGCAA 

2551 + + + + + 2600 

TVT VTAK SKK FPS YTAT 

CATATCAATT CAAACCTAAT TTGTGGGTGT TCTCCGGCAC CATGTCACTG 

YQF KPN LWVF SGT MSL 

CAATCAAGTG TCGAGGCGAG TCGAAATTGC CAGCGCACTG ATTTTACTGC 

2651 + + + + + 2700 

QSSV EAS RNC QRTD FTA 

GCTGATCGAG TCCGCACGCG CCAGTAATGG TTCGCGTTCA CCAGACGGTA 

2701 + + + + + 2750 

LIE SARA SNG SRS PDGT 

CTCTGTGGGG AGAGTGGGGA AGTTTGGCAA CCTATGATAG CGCTGAGTGG 

2751 + + + + + 2800 

LWG EWG SLAT YDS AEW 

CCATCGGGTA ACTATTGGAC TAAAAAGACC AGTACAGATT TTGTCACTAT 

2801 + + + + + 2850 

PSGN YWT KKT STDF VTM 

GGATATGACC ACCGGTGACA TACCAACATC TGCGGCTACG GCGTATCCGC 

2851 + + + + + 2900 

DMT TGDI PTS A A T AYPL 
Figure 1 (continued) 



TGTGTGCGGA GCCGCAATAG TGCTAAATAC CAATCTTGCG GCCCAGCAAA 

2901 + + + + + 2950 

CAE P Q * 

CTGGCACCTT TAGCGTGACC ATCTGGCC r ^ TACAGTGATT GGCCGTGGCG 
2951 + + + + + 3000 

CGTATTCAAA ACCGCCAGCG CCTGAGTGTT ATGCTCAATA TGCTGTTGCA 
3001 + + + + + 3050 

GCAAAAGCCC GTTATGCAGG TTGCCGTAGC GCACCGTTCG GCCAGTTCCA 
3051 + + + + + 3100 

AAATACGCTG CCAGCGCTCA GCTAGCGCAG GAACGTTGCT GTAGGGCGCT 
3101 + + + + + 3150 

TGAATATTTA TGTTTTTTTC GGTGGTGAGC CGGGTCTGGT CCAGATAAGC 
3151 + + + + + 3200 

CAAGGTGCCA AAATTGAACT TTTTTGTTCA GTGACGCCTT GCAACACGAT 
3201 + + + + + 3250 

ACCTTGAATC CGACCGGAGC ACAGCAGTTG CTGCTCTTGT GCTACTACGG 
3251 + + + + + 3300 

TTTTCAGGGA TTCAAGCAGT TCCAGTTGCT GGTCCAGATT AGTTTGTAAT 
3301 + + + + + 3350 

CTTTCCACCA CCACCTATCC TTTTACGGTT AATAATTTTA CGGTCAACGA 
3351 + + + + + 3400 

TTGTTGTGAC GTTTAGCTAT TCTTCAGGTC ATCGGCAACA TTTTTGAGCA 
3401 + + + + + 3450 

AGGCATCGGC AATTTTACCG GCGTCCATGG TCAGTTGGCC TGAACGGATC 
3451 + + + + + 3500 

GCCTGTTTTA AGGTTTCGAC ACGTTCTACA TTGATGTCCT GGCTGCCCGG 
3501 + + + + + 3550 

TTGCATCAAT TTTGCCTGCG CGTCGCTCAA TTTAACCTCA GTACCACTTA 
3551 + + + + + 3600 
Figure 2 

GGATCACATC ATCAATACCG AAGCCCAAGA GATTAGCCAG TGTGCCAGAA 
+ + + + + 50 



AAATGGCTCG ATGGGACGTT GGTGGAAGGA AGCAAATATT GCTTACGGCA 
51 + + + + + 100 

CGAAAACGCA TGATAGATGA GCTTCAGATG TATTTGCCAG GACTGGGAAG 
101 + + + + + 150 

TCACGTGGGT AATTACTGTG ACATCCAGTA ATAAAACAGA GCCTCTATTA , 
151 + + + + + 200 

AAGGAGCTTC CCAATTTGAA ATCAGAAAAA TTACATCATA AACATGGGTG 
201 + + + + + 250 

TCCAGAAGTC AGTCGGCGAT ATATCCATTT AAAGAGCATT GAGCTATGAC 
251 + + + + + 300 

CAGTATTCAT CAACTACAGA ACAAAAATAC AGGAATAAGT GACTGATGGG 
301 + + + + + 350 

ATAAAGCTGA GGTAAGCTCA CAGTACTGTA TCAATATCCA TATTTACATA 
351 + + + + + 400 

TATATCATGG ATTTGGCATT ATATCATCAG CCATGTCAGT GATATGGTTA 
401 + + + + + 450 

TTGTATTAGT ATTGTTATAA CAATCTGGAT TATTTTTATG AAAAAGACAT 
451 + + + + + 500 

TACTAGCTAG TTCTCTAATA GCCTGTTTAT CAATTGCGTC TGTTAATGTG 

501 + + + + + 550 

AS V N V 

TACGCTGCGA GTGAAAGTAG TATTTCTATT GGTTATGCGC AAAGCCATGT 

551 + + + + + 600 

Y A A S ESS ISI G Y A Q SHV 

AAAAGAAAAT GGGTATACAT TGGATAATGA CCCTAAAGGT TTTAACCTGA 

601 + + + + + 650 

KEN G Y T L D ND PKG FNLK 



AGTACCGTTA TGAACTCGAT GATAACTGGG GAGTAATAGG TTCGTTTGCT 

651 + + + + + 700 

Y R Y ELD DNWG VIG SFA 

TATACTCATC AGGGATATGA TTTCTTCTAT GGCAGTAATA AGTTTGGTCA 

701 + + + + + 750 

Y T H Q G Y D FFY GSNK FGH 



TGGTGATGTT GATTACTATT CAGTAACAAT GGGGCCATCT TTCCGCATCA 

751 + + + + + 800 

GDV D Y Y S VTM GPS FRIN 

ACGAATATGT TAGCCTTTAT GGATTACTGG GGGCCGCTCA TGGAAAGGTT 

801 + + + + + 850 

EYV SLY GLLG AAH GKV 
Figure 2 (continued) 

AAGGCATCTG TATTTGATGA ATCAATCAGT GCAAGTAAGA CGTCAATGGC 

851 + + + + + 900 

KASV F D E SIS ASKT SMA 

ATACGGGGCA GGGGTGCAAT TCAACCCACT TCCAAATTTT GTCATTGACG 

901 + + + + + 950 

Y G A GVQF NPL PNF VIDA 

CTTCATATGA ATACTCCAAA CTCGATAGCA TAAAAGTTGG CACCTGGATG 

951 + + + + + 1000 

S Y E Y S K LDSI KVG TWM 

CTTGGTGCAG GGTATCGATT CTAATCATCT CAGATAGTGA AAACCCACCT 

1001 + + + + + 1050 

LGAG Y R F * 

GAGTGAAGTG AACCCCATTT ATTGGACACT TTTCCTGGCG GTTGACATGG 

1051 + + + + + 1100 

CCTGATTTCG GTACTGCACC GGACTCAGGC CGTTTAATTT TACTTTGATC 
1101 + + + + + 1150 

CTTTCGTTGT TGTAGTAATG GATATACTCA TCCACCGCTT TTTTCAGTTG 
1151 + + + + + 1200 

TTCTACATCT TCGTATTTTT CATTGTGCCA GCATTCAGTC TTCAGCAGAC 
1201 + + + + 1250 

CAAAAAAGTT TTCTATCACA GCATTATCCA GGCAGTTGCC CTTGCGCGAC 
1251 + + + + + 1300 

ATACTTTGCT TTACTTCGCC AGACCCCAGC CTTTTCTTAT AGCTTGCCAT 
1301 + + + + + 1350 

CTGATATTGC CAGCCCTGAT CCGAGTGAAG TACAGGTTCA TCGCCTGAGT 
1351 + + + + + 1400 

TCAACTTCTG TAGCGCATCA TCAAGCATTT TATCAATCAG GTTCATTCCG 
1401 + + + + + 1450 

GGATGCGTAT CCATCTGCCA GGCAACGACT TCGCTGTTAT ACAGATCCAG 
1451 + + + + + 1500 

CACGGGTGAC AGATACAGCT TTTTACCCCT GACGTTGAAC TCGGTCACAT 
1501 + + + + + 1550 

CGTTACCCAC TTCTGGTTAG GGGCTTCGGC AGTAAATTTT CGAGCAAGTA 
1551 + + + + + 1600 

TATTAGGGAC CACTTTACCG TAGGCACCCT GATATGACTG ATATTTTTTA 
1601 + + + + + 1650 

CGACGCAAGT TAGATGCAAG CTGCTGTTGC CGCATGAGTT TTCGTACGGT 
1651 + + + + + 1700 

TTTATGGTTA AGACTCCCGC CCTCATTGCG TAGGGCCAGC GTTATTCTGC 
1701 + + + + + 1750 
Figure 2 (continued) 

GGTAACCATA GCGACCTTTA TGATGGTGAA ACAGGGTTTT TATTCTTTGT 
1751 + + + + + 1800 

TTCTCATCCG CATAAGTCTC TTCACGACCA CTGGATTTTA CCTGCCAGTA 
1801 + + + + + 1850 

GAAGGTGCTG CGCGGAAGAC CGGCCACGTA AAGCAAGGTC GCCAGTTTAT 
1851 + + + + + 1900 

ACAGATGCCT TAATTCAGTG ATTATTCGCG TTTTTTCCGC TGCTTCTCTT 
1901 + + + + + 1950 

ACAGGTGGTA TTCACTGAGT GCCACCGATA ATGCGCAGGC AAAGTCATTA 
1951 + + + + + 2000 

ACGACCCCCG CCGCTCACCC TGAGCATGGT CGTTGATGGC TTTTATATTT 
2001 + + + + + 2050 

TCCATAGAGC AGAGGATGAT TCTTTATGTC CCGAGTGAAC TGGGGTGAAC 
2051 + + + + + 2100

GGTTATCCCG GTTTGCCGCT GAATGGCAAC GGACGGGAAT ATCCCCTAAA 
2101 + + + + + 2150 

GAGTGGTGTG AGAGAGAAGG TTATTCGTGG GGAACAGCGA AAGCGTATAT 
2151 + + + + + 2200 

TTCGATAAAA GCAGCGAAAG 
2201 + + 2220 
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Figure 3 

ATCTTTTCCG TGGTATGACC AGAACATAAA GTTTTTGCTG CCCCAACGCC
1 + + + + + 50



GGTGTCAGGC GCATAACGCC TTCCAGCCGA TfTGCACTCA TCACGCCTGG 
51 + + + + + 100 



TTCCTAGTAG GTGAAATAAC TGCTGGGGAA GAAGGCTGAT GGGGTCATGT

101 + + + + + 150

TCTGATCAAG AATCAGCACG TTCGGCGCAA AAACGCTGGT TTGTTTGTTC 
151 + + + + + 200



ACTTCGCTGG TCAGCGTCAG GGTCAGTTCG CCAATGTTTG CCGGGACGCT 
201 + + + + + 250 



GTACGCAGCA ACCGGACCAC TGATGCCGGG AACGTTCAGT TGTTGGCCGC 
251 + + + + + 300 



CGGTCGCCAG TTGGGTGGTC TGGGTTTTAG ATTGATCGAC CGGTGTCCAG 
301 + + + + + 350 



GTGAGTTGTT GCAGCGCAGC AGATGGAATG GCTGGCGCGT CGCTGGTGTT 
351 + + + + + 400 



TTGCGGTACG TAGTTAACAT CGGCAAGGCT AATTCCAGGC GCGCTTGCCA 
401 + + + + + 450



GTAACCCTGC TGATAAACAG AGGACGATGA GACTTTTATT CATTTTCATT 
451 + + + + + 500 



GTTTTCACCT CAAAATCTGG AGCTCAGCGG TAGCCAGGCA ATAGCGCGCT 
501 + + + + + 550



AAACCCGATA ATCAGAGGGG CTTTCGCCCC TTCAGATAAT GACAACCTGT 
551 + + + + + 600

TTTTATGCCG GATGCGGCGT AAACGCCTTA TCCGGCCTAC ATTTGACAGC 
601 + + + + + 650

CGTTGTAGGC CTGATAAGAC GCGCAAGCGT CGCATCAGGC GTTGGTTGCC 
651 + + + + + 700
Figure 3 (continued)



GAATGCGGCG TAAACGCCTT ATCCGGCCCA GGTTTTGCTA TTACCACCAG

701 + + + + + 750 

ATTTCCATCT GGGCACCGAA GGTCCACTCG TCGCTGTCGC CACGACCGAA 
751 + + + + + 800 

GCTGCCGCCG TTGAAATCAG CAGGAACGGC TTTGCCGAAG TTCGCGTTGT
801 + + + + + 850

TATCAGCGTT ACCGGTGTAG TCGTAACCCC ATTTCTCATC CCACTTGGCG 
851 + + + + + 900 

TAGGTTGCGA AGACACGAAT AGCCGGGCGT GACCAGATGC TGTCGCCAGC
901 + + + + + 950 

CTGCCATTGT TGTGCGAGGG TAATTTTGTA CTGATTGTTC TTGTCGCCGG 
951 + + + + + 1000 

TGCGCTGGGA TTCGACGTTG TCGTAGCCGA TTTCCATCAC GGTGCTCATG 
1001 + + + + + 1050 

ATTGGCGTCC ACTTGTACAT CGGGCGAATA CCGACGGTCC ACCACTTGGT 
1051 + + + + + 1100 

GCCGTTGTCG TTATCCCAGT TGATATCCTG GTACATACCC ACGTACATCA 
1101 + + + + + 1150 

TGTCCCAGTT GTCGCCCATG GAGATCGCAC CGTGGTCGAG GATACGCAGC 
1151 + + + + + 1200

ATGTGACCGT TGTTGTTGAT ATTGTAGGCA AATTTTTCGT TATCAAATGC 
1201 + + + + + 1250 

AACGCCAGAA CCCTGCGACA GCCCTTTACC CTGCGAGGTC ATCGAGTCAG 
1251 + + + + + 1300 

TAGCGTACTG AACAACAAAC TTGTTAAAGC CCTTCAGGAC ACTCTGAGTA 
1301 + + + + + 1350 

TGTTCAGCAG TGAATAACCA GCCGTCTTTC GATGCGCCAT CAACCAGACG 
1351 + + + + + 1400

ATAGTTATCA CGCAAGTTGG CACGACCGTA GTCGACACCC AGTTCTAATG 
1401 + + + + + 1450
Figure 3 (continued)

TGCCGCCCGG GTTGATTTCC ATCTGCGCTA AACGCACATC GAAAACGTCG 
1451 + + + + + 1500 



TTCGCGGTTT CGTTGGTATA GTCATAAATA TTGTTGCTGG CGAAAGAGGA 
1501 + + + + + 1550 



AGAACCACCA GCTTCAGAGG AGCGGGTTGC TGCCAGAGAG AGTTTACCGA 

1551 + + + + + 1600

AGCCAACATC GATGTTTTCC AGACCGGCAC CAGGACCAGA AATATCCCAG 

1601 + + + + + 1650



TAGTAGAAGT CGATCATATG AACGTCATGA CGTTGGTAGA AGCGCTTACC 
1651 + + + + + 1700 



TGCCCAGATG GTGGAGCCTG GCAGCCATTC GATCAGGTTT TTACCCTGCA 
1701 + + + + + 1750 



CGTTTGCTTC ACGGAAGGCC GGATCGGTAG CTTCCCAGTC ATTCTGTTGT 
1751 + + + + + 1800 



GCGACGGAAT AGGCCACGTT AGTGTCGAAA TAGAAGCTCT TATCGCCCTC 
1801 + + + + + 1850



TTTCCACACT TCCTGACCCA ATTTTAATTC AGCATAAGTT TCACATTCGT 
1851 + + + + + 1900

TGCCAAGACG GTATTTACTT TGAGCACCGG TAGTCTGGAA ACACTGTTGT 
1901 + + + + + 1950



TCACCGCCGC TACCTGTCCA ACCAATACCG GAACGTGCAT AGCCGTGGAA 

1951 + + + + + 2000

ATCAACAGCC ATTGCCTGAG CAGACATTAC GCCCGCTGCG ACGGCAACCG 

2001 + + + + + 2050

CCAGAGGAAG TTTGCGCAGA GTAATCATCA TTCTATCTCC TGAGTCATTG 

2051 + + + + + 2100

CTTTTCTTTT TTCACATCAC CTGTGACAGG CTTTGTGTGT TTTGTGGGGT 

2101 + + + + + 2150



WO 96/13250 PCT/US95/13749

12/33 

Figure 3 (continued)

GCTTAAACGC CCGGCTCCTT ATGCAGTCGA CGACATGCAG TGCCATCCTC 

2151 + + + + + 2200 



ACGGAACAGA TGGCAACGCT CTGGCGGCAG GCCGATAGCG AATGTGGCAC 
2201 + + + + + 2250 



CTTCTTCTAC CAACACCACG TCGTTCTGGC GGTACACCAG GTTTTGACGA 
2251 + + + + + 2300

ATGGAAGGGA TCTGGATATG GATTTGAGTT TCGTTGCCGA GTTGCTCGAC 
2301 + + + + + 2350 



GACCTGAACT TCACCCTCAA GGATGACGTC AGCGATATCA CTCGGCAGTA 
2351 + + + + + 2400 



GATGTTCCGG GCGAATACCC AGCGACATAT TGGCTCCAAC CTGGACATCA 
2401 + + + + + 2450 



CGGCTTTCAA CTGGCAGCCA GACTTGCTGA CGATTTGGCA TCGGCAGCTC 
2451 + + + + + 2500 



CACCTGCACT TGATCGATTG CGGTGGCGGT CACTTTTACC GGCAGGAGTT 
2501 + + + + + 2550 



CATCTTTGGC GAACCGATAA ATCCGGCGAC AAAACGGTCT GCCGGATAGT 
2551 + + + + + 2600 



GGTACAGCTA GCGGTTTCCC AACCTGCGCC ACGCGACCGG CGTCCAGCAC 
2601 + + + + + 2650 



CACGATTTTG TCGGCCAGCG TCATCGCTTC GACCTGATCG TGGGTGACGT 
2651 + + + + + 2700 



AAATCATTGT GCGGCCCAGG CGTTTATGCA GACGGGAGAT TTCGATACGC
2701 + + + + + 2750



ATTTGCACAC GCAGTGCAGC ATCGAGGTTG GAGAGCGGTT CATCGAGCAA 
2751 + + + + + 2800 



AAATACGCTT GGCTCGGCCA CCAGCGTACG GCCAATCGCC ACACGCTGAC 
2801 + + + + + 2850 
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Figure 3 (continued) 

GCTGACCACC GGAGAGCGCT TTCGGTTTGC GATCCAGCAA ATGCGCCAGT 

2851 + + + + + 2900

TGTAGCACTT CCGCCACCTG GTTAACGCGT TGGTTAATCA CCTCTTTTTT 

2901 + + + + + 2950 

TGCGCCAGCA GGTTTCAGGC CAAATGACAT GTTTTCTGCT ACTGACAGGT 

2951 + + + + + 3000 



GGGGATAGAG CGCGTAAGAC TGAAACACCA TACCAACGCC GCGTTCTGCT 
3001 + + + + + 3050 

GGCGGAGTGT CATTCATCCG TTTCTCACCG ATGAACAGGT CGCCGCTGGT 
3051 + + + + + 3100 

GATCGTCTCA AGCCCGGGAA TCATGCGCAG TAAAGTCGAT TTACCGCAGC 
3101 + + + + + 3150 

CAGACGGTCC GACAAACACC ACGAATTCAC CTTCATGGAT ATCGAGATTG 
3151 + + + + + 3200 

ATATCTTTCG ATACCACGAC CTCGCCCCAG GCTTTCGTTA CATTTTGCAG 
3201 + + + + + 3250 

CTGTACGCTC GCCATGCCCT TCTCCCTTTG TAACAACCTG TCATCGACAG 
3251 + + + + + 3300

CAACATTCAT GATGGGCTGA CTATGCGTCA TCAGGAGATG GCTTAAATCC 
3301 + + + + + 3350 

TCCACCCCCT GGCTTTTTTA TGGGGGAGGA GGCGGGAGGA TGAGAACACG 
3351 + + + + + 3400 

GCTTCTGTGA ACTAAACCGA GGTCATGTAA GGAATTTCGT GATGTTGCTT
3401 + + + + + 3450 

GCAAAAATCG TGGCGATTTT ATGTGCGCAT CTCCACATTA CCGCCAATTC 
3451 + + + + + 3500 



TGTAACAGAG ATCACACAAA GCGACGGTGG GGCGTAGGGG CAAGGAGGAT
3501 + + + + + 3550
Figure 3 (continued)

GGAAAGAGGT TGCCGTATAA AGAAACTAGA GTCCGTTTAG GTGTTTTCAC

3551 + + + + + 3600 

GAGCACTTCA CCAACAAGGA CCATAGATTA TGAAAATAAA AACAGGTGCA 

3601 + + + + + 3650 

M K I K T G A 

CGCATCCTCG CATTATCCGC ATTAACGACG ATGATGTTTT CCGCCTCGGC

3651 + + + + + 3700 

R I L A LSA LTT MMFS ASA 

TCTCGCCAAA ATCGAAGAAG GTAAACTGGT AATCTGGATT AACGGCGATA 

3701 + + + + + 3750

L A K IEEG KLV IWI NGDK 

AAGGCTATAA CGGTCTCGCT GAAGTCGGTA AGAAATTCGA GAAAGATACC 

3751 + + + + + 3800 

G Y N GLA EVGK KFE KDT 

GGAATTAAAG TCACCGTTGA GCATCCGGAT AAACTGGAAG AGAAATTCCC 

3801 + + + + + 3850 

GIKV TVE HPD KLEE KFP 

ACAGGTTGCG GCAACTGGCG ATGGCCCTGA CATTATCTTC TGGGCACACG 

3851 + + + + + 3900 

QV A ATGD GPD I I F WAHD 

ACCGCTTTGG TGGCTACGCT CAATCTGGCC TGTTpGCTGA AATCACCCCG 

3901 + + + + + 3950 

RFG GYA QSGL LAE ITP 

GACAAAGCGT TCCAGGACAA GCTGTATCCG TTTACCTGGG ATGCCGTACG 

3951 + + + + + 4000 

DKAF QDK LYP FTWD AVR 

TTACAACGGC AAGCTGATTG CTTACCCGAT CGCTGTTGAA GCGTTATCGC 

4001 + + + + + 4050 

YNG KLIA YPI AVE ALSL 

TGATTTATAA CAAAGATCTG CTGCCGAACC CGCCAAAAAC CTGGGAAGAG 

4051 + + + + + 4100 

I Y N KDL LPNP PKT WEE 

ATCCCGGCGC TGGATAAAGA ACTGAAAGCG AAAGGTAAGA GCGCGCTGAT 

4101 + + + + + 4150 

IPAL DKE LKA KGKS ALM 
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Figure 3 (continued) 

GTTCAACCTG CAAGAACCGT ACTTCACCTG GCCGCTGATT GCTGCTGACG 

4151 + + + + + 4200

F N L QEPY FTW P L I A A D G 

GGGGTTATGC GTTCAAGTAT GAAAACGGCA AGTACGACAT TAAAGACGTG 

4201 + + + + + 4250

GYA F K Y ENGK YDI KDV 

GGCGTGGATA ACGCTGGCGC GAAAGCGGGT CTGACCTTCC TGGTTGACCT
4251 + + + + + 4300 

GVDN A G A KAG LTFL V D L 

GATTAAAAAC AAACACATGA ATGCAGACAC CGATTACTCC ATCGCAGAAG 

4301 + + + + + 4350

I K N KHMN ADT DYS IAEA 

CTGCCTTTAA TAAAGGCGAA ACAGCGATGA CCATCAACGG CCCGTGGGCA 
4351 + + + + + 4400

A F N KGE TAMT I N G PWA 

TGGTCCAACA TCGACACCAG CAAAGTGAAT TATGGTGTAA CGGTACTGCC 

4401 + + + + + 4450 

WSNI DTS KVN YGVT VLP 

GACCTTCAAG GGTCAACCAT CCAAACCGTT CGTTGGCGTG CTGAGCGCAG 
4451 + + + + + 4500 

TFK GQPS KPF VGV LSAG 

GTATTAACGC CGCCAGTCCG AACAAAGAGC TGGCGAAAGA GTTCCTCGAA 

4501 + + + + + 4550

I N A ASP NKEL AKE FLE 

AACTATCTGC TGACTGATGA AGGTCTGGAA GCGGTTAATA AAGACAAACC 

4551 + + + + + 4600

NYLL TDE G L E AVNK DKP 

GCTGGGTGCC GTAGCGCTGA AGTCTTACGA GGAAGAGTTG GCGAAAGATC 

4601 + + + + + 4650 

LGA VALK SYE EEL AKDP 

CACGTATTGC CGCCACCATG GAAAACGCCC AGAAAGGTGA AATCATGCCG 

4651 + + + + + 4700

R I A ATM ENAQ KGE IMP 

AACATCCCGC AGATGTCCGC TTTCTGGTAT GCCGTGCGTA CTGCGGTGAT 

4701 + + + + + 4750 

NIPQ MSA FWY AVRT AVI 
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Figure 3 (continued) 

CAACGCCGCC AGCGGTCGTC AGACTGTCGA TGAAGCCCTG AAAGACGCGC

4751 + + + + + 4800

NAA SGRQ TVD EAL KDAQ 



AGACTCGTAT CACCAAGTAA TGCTGTGAAA TGCCGGATGC GGCGTGAACG 

4801 + + + + + 4850 

T R I T K * 



CCTTGTCCGG CCTACAAAAC CGAAACGTAT GTAGGCCTGA TAAGACGCGT
4851 + + + + + 4900 

CAGCGTCGCA TCAGGCAGTT GTTGTCGGAT AAGGCGTGAA AGCCTTATCC 
4901 + + + + + 4950 

GTCCTGGAAT GAGGAAGAAC CCCATGGATG TCATTAAAAA GAAACATTGG 
4951 + + + + + 5000

TGGCAAAGCG ACGCGCTGAA ATGGTCAGTG CTAGGTCTGC TCGGCCTGCT 
5001 + + + + + 5050 

GGTGGGTTAC CTTGTTGTTT TAATGTACGC ACAAGGGGAA TACCTGTTCG 
5051 + + + + + 5100

CCATTACCAC GCTGATATTG AGTTCAGCGG GGCTGTATAT TTTCGCCAAT 
5101 + + + + + 5150 

CGTAAAGCCT ACGCCTGGCG CTATGTTTAC CCGGGAATGG CTGGAATGGG 
5151 + + + + + 5200

ATTATTCGTC CTCTTCCCTC TGGTCTGCAC CATCGCCATT GCCTTCACCA 
5201 + + + + 5250 

ACTACAGCAG CACTAACCAG CTGACTTTTG AACGTGCGCA GGAAGTGTTG 
5251 + + + + + 5300

TTAGATCGCT CCTGGCAAGC AGGCAAAACC TATAACTTTG GTCTTTACCC 
5301 + + + + + 5350 

GGCGGGCGAT GAGTGGCAAC TGGCGCTCAG CGACGGCGAA ACCGGCAAAA 
5351 + + + + + 5400 



ATTACCTCTC CGACGCTTTT AAATTTGGCG GCGAGCAAAA ACTGCAACTG
5401 + + + + + 5450
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Figure 3 (continued) 

AAAGAAACGA CCGCCCAGCC CGAAGGCGAA CGCGCGAATC TGCGCGTGAT 

5451 + + + + + 5500

TACCCAGAAT CGTCAGGCGC TGAGTGACAT TACCGCCATT CTGCCGGATG 
5501 + + + + + 5550

GCAACAAAGT GATGATGAGC TCCCTGCGCC AGTTTTCTGG CACGCAGCCG 
5551 + + + + + 5600 

CTCTACACAC TCGACGGTGA CGGCACGTTG ACGAATAATC AGAGCGGCGT 
5601 + + + + + 5650 

GAAATATCGT CCGAATAACC AAATTGGCTT TTACCAGTCC ATTACCGCCG 
5651 + + + + + 5700

ACGGCAACTG GGGTGATGAA AAGCTAAGCC CCGGTTACAC CGTGACCACC 
5701 + + + + + 5750

GGCTGGAAAA ACTTTACCCG CGTCTTTACC GACGAAGGCA TTCAGAAACC 
5751 + + + + + 5800 

GTTCCTCGCC ATTTTCGTCT GGACCGTGGT GTTCTCGCTG ATCACTGTCT 
5801 + + + + + 5850 

TTTTAACGGT GGCGGTCGGC ATGGTTCTGG CGTGTCTGGT GCAGTGGGAA 
5851 + + + + + 5900

GCGTTGCGCG GCAAAGCGGT CTATCGCGTC CTGCTGATTC TGCCCTACGC 
5901 + + + + + 5950 



GGTGCCATCG TTCATTTCAA TCTTGATTTT CAAAGGGTTG TTTAACCAGA 
5951 + + + + + 6000 



GCTTCGGTGA AATCAACATG ATGTTGAGCG CGCTGTTTGG CGTGAAGCCC 
6001 + + + + + 6050



GCCTGGTTCA GCGATCCGAC CACCGCCCGC ACGATGCTAA TTATCGTCAA 
6051 + + + + + 6100 



TACCTGGCTG GGTTATCCGT ACATGATGAT CCTCTGCATG GGCTTGCTGA 
6101 + + + + + 6150 



AAGCGATTCC GGACGATTTG TATGAAGCCT CAGCAATGGA TGGCGCAGGT 
6151 + + + + + 6200 



Figure 3 (continued)

CCGTTCCAGA ACTTCTTTAA GATTACGCTG CCGCTGCTGA TTAAACCGCT 

6201 + + + + + 6250 

GACGCCGCTG ATGATCGCCA GCTTCGCCTT TAACTTTAAC AACTTCGTGC 

6251 + + + + + 6300 

TGATTCAACT GTTAACCAAC GGCGGCCCGG ATCGTCTTGG CACGACCACG

6301 + + + + + 6350

CCAGCCGGTT ATACCGACCT GCTTGTTAAC TACACCTACC GCATCGCTTT 

6351 + + + + + 6400 

TGAAGGCGGC GGGGGTCAGG ACTTCGGTCT GGCGGCAGCA ATTGCCACGC 

6401 + + + + + 6450 

TGATCTTCCT GCTGGTGGGT GCGCTGGCGA TAGTGAACCT GAAAGCCACG 

6451 + + + + + 6500 

CGAATGAAGT TTGATTAAGG GAGATAACAA AAATGGCAAT GGTCC 

6501 + + + + 6545 



Figure 14

MBP-INV(192) Fusion Protein

Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr
1 5 10 15

Thr Met Met Phe Ser Ala Ser Ala Leu Ala Lys Ile Glu Glu Gly Lys
20 25 30

Leu Val Ile Trp Ile Asn Gly Asp Lys Gly Tyr Asn Gly Leu Ala Glu
35 40 45

Val Gly Lys Lys Phe Glu Lys Asp Thr Gly Ile Lys Val Thr Val Glu
50 55 60

His Pro Asp Lys Leu Glu Glu Lys Phe Pro Gln Val Ala Ala Thr Gly
65 70 75 80

Asp Gly Pro Asp Ile Ile Phe Trp Ala His Asp Arg Phe Gly Gly Tyr
85 90 95

Ala Gln Ser Gly Leu Leu Ala Glu Ile Thr Pro Asp Lys Ala Phe Gln
100 105 110

Asp Lys Leu Tyr Pro Phe Thr Trp Asp Ala Val Arg Tyr Asn Gly Lys
115 120 125

Leu Ile Ala Tyr Pro Ile Ala Val Glu Ala Leu Ser Leu Ile Tyr Asn
130 135 140

Lys Asp Leu Leu Pro Asn Pro Pro Lys Thr Trp Glu Glu Ile Pro Ala
145 150 155 160

Leu Asp Lys Glu Leu Lys Ala Lys Gly Lys Ser Ala Leu Met Phe Asn
165 170 175

Leu Gln Glu Pro Tyr Phe Thr Trp Pro Leu Ile Ala Ala Asp Gly Gly
180 185 190

Tyr Ala Phe Lys Tyr Glu Asn Gly Lys Tyr Asp Ile Lys Asp Val Gly
195 200 205

Val Asp Asn Ala Gly Ala Lys Ala Gly Leu Thr Phe Leu Val Asp Leu
210 215 220

Ile Lys Asn Lys His Met Asn Ala Asp Thr Asp Tyr Ser Ile Ala Glu
225 230 235 240

Ala Ala Phe Asn Lys Gly Glu Thr Ala Met Thr Ile Asn Gly Pro Trp
245 250 255

Ala Trp Ser Asn Ile Asp Thr Ser Lys Val Asn Tyr Gly Val Thr Val
260 265 270

Leu Pro Thr Phe Lys Gly Gln Pro Ser Lys Pro Phe Val Gly Val Leu
275 280 285

Ser Ala Gly Ile Asn Ala Ala Ser Pro Asn Lys Glu Leu Ala Lys Glu
290 295 300

Phe Leu Glu Asn Tyr Leu Leu Thr Asp Glu Gly Leu Glu Ala Val Asn
305 310 315 320
Figure 14 (continued)

Lys Asp Lys Pro Leu Gly Ala Val Ala Leu Lys Ser Tyr Glu Glu Glu
325 330 335

Leu Ala Lys Asp Pro Arg Ile Ala Ala Thr Met Glu Asn Ala Gln Lys
340 345 350

Gly Glu Ile Met Pro Asn Ile Pro Gln Met Ser Ala Phe Trp Tyr Ala
355 360 365

Val Arg Thr Ala Val Ile Asn Ala Ala Ser Gly Arg Gln Thr Val Asp
370 375 380

Glu Ala Leu Lys Asp Ala Gln Thr Arg Ile Thr Lys Val Pro Thr Leu
385 390 395 400

Thr Gly Ile Leu Val Asn Gly Gln Asn Phe Ala Thr Asp Lys Gly Phe
405 410 415

Pro Lys Thr Ile Phe Lys Asn Ala Thr Phe Gln Leu Gln Met Asp Asn
420 425 430

Asp Val Ala Asn Asn Thr Gln Tyr Glu Trp Ser Ser Ser Phe Thr Pro
435 440 445

Asn Val Ser Val Asn Asp Gln Gly Gln Val Thr Ile Thr Tyr Gln Thr
450 455 460

Tyr Ser Glu Val Ala Val Thr Ala Lys Ser Lys Lys Phe Pro Ser Tyr
465 470 475 480

Ser Val Ser Tyr Arg Phe Tyr Pro Asn Arg Trp Ile Tyr Asp Gly Gly
485 490 495

Arg Ser Leu Val Ser Ser Leu Glu Ala Ser Arg Gln Cys Gln Gly Ser
500 505 510

Asp Met Ser Ala Val Leu Glu Ser Ser Arg Ala Thr Asn Gly Thr Arg
515 520 525

Ala Pro Asp Gly Thr Leu Trp Gly Glu Trp Gly Ser Leu Thr Ala Tyr
530 535 540

Ser Ser Asp Trp Gln Ser Gly Glu Tyr Trp Val Lys Lys Thr Ser Thr
545 550 555 560

Asp Phe Glu Thr Met Asn Met Asp Thr Gly Ala Leu Gln Pro Gly Pro
565 570 575

Ala Tyr Leu Ala Phe Pro Leu Cys Ala Leu Ser Ile
580 585
Figure 15

MBP-AIL Fusion Protein 

Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr
1 5 10 15

Thr Met Met Phe Ser Ala Ser Ala Leu Ala Lys Ile Glu Glu Gly Lys
20 25 30

Leu Val Ile Trp Ile Asn Gly Asp Lys Gly Tyr Asn Gly Leu Ala Glu
35 40 45

Val Gly Lys Lys Phe Glu Lys Asp Thr Gly Ile Lys Val Thr Val Glu
50 55 60

His Pro Asp Lys Leu Glu Glu Lys Phe Pro Gln Val Ala Ala Thr Gly
65 70 75 80

Asp Gly Pro Asp Ile Ile Phe Trp Ala His Asp Arg Phe Gly Gly Tyr
85 90 95

Ala Gln Ser Gly Leu Leu Ala Glu Ile Thr Pro Asp Lys Ala Phe Gln
100 105 110

Asp Lys Leu Tyr Pro Phe Thr Trp Asp Ala Val Arg Tyr Asn Gly Lys
115 120 125

Leu Ile Ala Tyr Pro Ile Ala Val Glu Ala Leu Ser Leu Ile Tyr Asn
130 135 140

Lys Asp Leu Leu Pro Asn Pro Pro Lys Thr Trp Glu Glu Ile Pro Ala
145 150 155 160

Leu Asp Lys Glu Leu Lys Ala Lys Gly Lys Ser Ala Leu Met Phe Asn
165 170 175

Leu Gln Glu Pro Tyr Phe Thr Trp Pro Leu Ile Ala Ala Asp Gly Gly
180 185 190

Tyr Ala Phe Lys Tyr Glu Asn Gly Lys Tyr Asp Ile Lys Asp Val Gly
195 200 205

Val Asp Asn Ala Gly Ala Lys Ala Gly Leu Thr Phe Leu Val Asp Leu
210 215 220

Ile Lys Asn Lys His Met Asn Ala Asp Thr Asp Tyr Ser Ile Ala Glu
225 230 235 240

Ala Ala Phe Asn Lys Gly Glu Thr Ala Met Thr Ile Asn Gly Pro Trp
245 250 255

Ala Trp Ser Asn Ile Asp Thr Ser Lys Val Asn Tyr Gly Val Thr Val
260 265 270

Leu Pro Thr Phe Lys Gly Gln Pro Ser Lys Pro Phe Val Gly Val Leu
275 280 285

Ser Ala Gly Ile Asn Ala Ala Ser Pro Asn Lys Glu Leu Ala Lys Glu
290 295 300
Figure 15 (continued)

Phe Leu Glu Asn Tyr Leu Leu Thr Asp Glu Gly Leu Glu Ala Val Asn
305 310 315 320

Lys Asp Lys Pro Leu Gly Ala Val Ala Leu Lys Ser Tyr Glu Glu Glu
325 330 335

Leu Ala Lys Asp Pro Arg Ile Ala Ala Thr Met Glu Asn Ala Gln Lys
340 345 350

Gly Glu Ile Met Pro Asn Ile Pro Gln Met Ser Ala Phe Trp Tyr Ala
355 360 365

Val Arg Thr Ala Val Ile Asn Ala Ala Ser Gly Arg Gln Thr Val Asp
370 375 380

Glu Ala Leu Lys Asp Ala Gln Thr Asn Ser Ser Ser Val Pro Gly Arg
385 390 395 400

Gly Ser Ile Glu Gly Arg Ala Ser Val Asn Val Tyr Ala Ala Ser Glu
405 410 415

Ser Ser Ile Ser Ile Gly Tyr Ala Gln Ser His Val Lys Glu Asn Gly
420 425 430

Tyr Thr Leu Asp Asn Asp Pro Lys Gly Phe Asn Leu Lys Tyr Arg Tyr
435 440 445

Glu Leu Asp Asp Asn Trp Gly Val Ile Gly Ser Phe Ala Tyr Thr His
450 455 460

Gln Gly Tyr Asp Phe Phe Tyr Gly Ser Asn Lys Phe Gly His Gly Asp
465 470 475 480

Val Asp Tyr Tyr Ser Val Thr Met Gly Pro Ser Phe Arg Ile Asn Glu
485 490 495

Tyr Val Ser Leu Tyr Gly Leu Leu Gly Ala Ala His Gly Lys Val Lys
500 505 510

Ala Ser Val Phe Asp Glu Ser Ile Ser Ala Ser Lys Thr Ser Met Ala
515 520 525

Tyr Gly Ala Gly Val Gln Phe Asn Pro Leu Pro Asn Phe Val Ile Asp
530 535 540

Ala Ser Tyr Glu Tyr Ser Lys Leu Asp Ser Ile Lys Val Gly Thr Trp
545 550 555 560

Met Leu Gly Ala Gly Tyr Arg Phe
565
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